# Information Architecture: Advanced Methodology
This document covers advanced techniques for card sorting analysis, taxonomy design, navigation optimization, and findability improvement.
## Table of Contents
1. [Card Sorting Analysis](#1-card-sorting-analysis)
2. [Taxonomy Design Principles](#2-taxonomy-design-principles)
3. [Navigation Depth & Breadth Optimization](#3-navigation-depth--breadth-optimization)
4. [Information Scent & Findability](#4-information-scent--findability)
5. [Advanced Topics](#5-advanced-topics)
---
## 1. Card Sorting Analysis
### Analyzing Card Sort Results
**Goal**: Extract meaningful patterns from user groupings
### Similarity Matrix
**What it is**: Shows how often users grouped two cards together
**How to calculate**:
- For each pair of cards, count how many users put them in the same group
- Express as percentage: (# users who grouped together) / (total users)
**Example**:
| Card | Sign Up | First Login | Quick Start | Reports | Dashboards |
|------|---------|-------------|-------------|---------|------------|
| Sign Up | - | 85% | 90% | 15% | 10% |
| First Login | 85% | - | 88% | 12% | 8% |
| Quick Start | 90% | 88% | - | 10% | 12% |
| Reports | 15% | 12% | 10% | - | 75% |
| Dashboards | 10% | 8% | 12% | 75% | - |
**Interpretation**:
- **Strong clustering** (>70%): "Sign Up", "First Login", "Quick Start" belong together → "Getting Started" category
- **Strong clustering** (75%): "Reports" and "Dashboards" belong together → "Analytics" category
- **Weak links** (<20%): "Getting Started" and "Analytics" are distinct categories
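As a computational sketch (assuming raw results are stored as one list of groups per participant; the data and names here are illustrative, not a fixed tool API):

```python
from itertools import combinations

# Hypothetical raw data: one card sort per participant, where each
# sort is a list of groups and each group is a set of card labels.
sorts = [
    [{"Sign Up", "First Login", "Quick Start"}, {"Reports", "Dashboards"}],
    [{"Sign Up", "Quick Start"}, {"First Login"}, {"Reports", "Dashboards"}],
    # ... one entry per participant
]

cards = sorted({card for sort in sorts for group in sort for card in group})

def similarity(card_a, card_b):
    """Fraction of participants who placed both cards in the same group."""
    together = sum(
        any(card_a in group and card_b in group for group in sort)
        for sort in sorts
    )
    return together / len(sorts)

pair_similarity = {pair: similarity(*pair) for pair in combinations(cards, 2)}
for (a, b), score in sorted(pair_similarity.items(), key=lambda kv: -kv[1]):
    print(f"{a} + {b}: {score:.0%}")
```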
### Dendrogram (Hierarchical Clustering)
**What it is**: Tree diagram showing hierarchical relationships
**How to create**:
1. Start with each card as its own cluster
2. Iteratively merge closest clusters (highest similarity)
3. Continue until all cards in one cluster
**Interpreting dendrograms**:
- **Short branches**: High agreement (merge early)
- **Long branches**: Low agreement (merge late)
- **Clusters**: Cut tree at appropriate height to identify categories
**Example**:
```
                     All Cards
                         |
         ________________+________________
         |                               |
  Getting Started                    Features
         |                               |
     ____+____                      _____+_____
    |         |                    |           |
 Sign Up First Login           Analytics   Settings
                                   |
                               ____+____
                              |         |
                           Reports Dashboards
```
**Insight**: Users see clear distinction between "Getting Started" (onboarding tasks) and "Features" (ongoing use).
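Beyond a handful of cards, the clustering is usually computed rather than drawn by hand. A minimal sketch with SciPy, feeding it the similarity matrix from Section 1 (distance taken as 1 - similarity; `method="average"` is one common linkage choice, not the only one):

```python
import numpy as np
import matplotlib.pyplot as plt
from scipy.cluster.hierarchy import dendrogram, linkage
from scipy.spatial.distance import squareform

labels = ["Sign Up", "First Login", "Quick Start", "Reports", "Dashboards"]
similarity = np.array([          # values from the similarity matrix above
    [1.00, 0.85, 0.90, 0.15, 0.10],
    [0.85, 1.00, 0.88, 0.12, 0.08],
    [0.90, 0.88, 1.00, 0.10, 0.12],
    [0.15, 0.12, 0.10, 1.00, 0.75],
    [0.10, 0.08, 0.12, 0.75, 1.00],
])

# Convert similarity to distance, then to the condensed form linkage expects.
condensed = squareform(1 - similarity, checks=False)
merges = linkage(condensed, method="average")

dendrogram(merges, labels=labels)   # short branches merge early = high agreement
plt.tight_layout()
plt.show()
```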
### Agreement Score (Consensus)
**What it is**: How much users agree on groupings
**Calculation methods**:
1. **Category agreement**: % of users who created similar category
- Example: 18/20 users (90%) created "Getting Started" category
2. **Pairwise agreement**: Average similarity across all card pairs
- Formula: Sum(all pairwise similarities) / Number of pairs
- High score (>70%) = strong consensus
- Low score (<50%) = weak consensus, need refinement
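A sketch of the pairwise calculation, with illustrative similarity values like those in the matrix above:

```python
# Pairwise similarities for every card pair (e.g., from the Section 1 sketch).
pair_similarity = {
    ("Sign Up", "First Login"): 0.85,
    ("Sign Up", "Quick Start"): 0.90,
    ("First Login", "Quick Start"): 0.88,
    ("Reports", "Dashboards"): 0.75,
    ("Sign Up", "Reports"): 0.15,
    # ... remaining pairs
}

agreement = sum(pair_similarity.values()) / len(pair_similarity)
print(f"Pairwise agreement: {agreement:.0%}")  # >70% strong, <50% weak
```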
**When consensus is low**:
- Cards may be ambiguous (clarify labels)
- Users have different mental models (consider multiple navigation paths)
- Category is too broad (split into subcategories)
### Outlier Cards
**What they are**: Cards that don't fit anywhere consistently
**How to identify**: Low similarity with all other cards (<30% with any card)
**Common reasons**:
- Card label is unclear → Rewrite card
- Content doesn't belong in product → Remove
- Content is unique → Create standalone category or utility link
**Example**: "Billing" card — 15 users put it in "Settings", 3 in "Account", 2 didn't categorize it
- **Action**: Clarify if "Billing" is settings (configuration) or account (transactions)
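Identification is mechanical once pairwise similarities exist; a sketch (the 0.30 threshold mirrors the <30% rule above, and the names are illustrative):

```python
def find_outliers(pair_similarity, cards, threshold=0.30):
    """Cards whose highest similarity with any other card is below threshold."""
    return [
        card
        for card in cards
        if max(s for pair, s in pair_similarity.items() if card in pair) < threshold
    ]

# Usage with the pair_similarity dict from the similarity-matrix sketch:
# find_outliers(pair_similarity, cards)  ->  e.g. ["Billing"]
```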
---
## 2. Taxonomy Design Principles
### Mutually Exclusive, Collectively Exhaustive (MECE)
**Principle**: Categories don't overlap AND cover all content
**Mutually exclusive**: Each item belongs to exactly ONE category
- **Bad**: "Products" and "Best Sellers" (best sellers are also products — overlap)
- **Good**: "Products" (all) and "Featured" (separate facet or tag)
**Collectively exhaustive**: Every item has a category
- **Bad**: Categories: "Electronics", "Clothing" — but you also sell "Books" (gap)
- **Good**: Add "Books" OR create "Other" catch-all
**Testing MECE**:
1. List all content items
2. Try to categorize each
3. If item fits >1 category → not mutually exclusive
4. If item fits 0 categories → not collectively exhaustive
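Once content and category assignments live in a spreadsheet or CMS export, this test can be scripted; a sketch with illustrative data:

```python
# Hypothetical CMS export: item -> set of categories it was filed under.
assignments = {
    "USB-C cable": {"Electronics"},
    "Bestselling novel": {"Books", "Best Sellers"},   # fits two categories
    "Garden hose": set(),                             # fits none
}

overlaps = {item: cats for item, cats in assignments.items() if len(cats) > 1}
gaps = [item for item, cats in assignments.items() if not cats]

if overlaps:
    print("Not mutually exclusive:", overlaps)
if gaps:
    print("Not collectively exhaustive:", gaps)
```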
### Polyhierarchy vs. Faceted Classification
**Polyhierarchy**: Item can live in multiple places in hierarchy
- **Example**: "iPhone case" could be in:
- Electronics > Accessories > Phone Accessories
- Gifts > Under $50 > Tech Gifts
- **Pro**: Matches multiple user mental models
- **Con**: Confusing (where is "canonical" location?), hard to maintain
**Faceted classification**: Item has ONE location, multiple orthogonal attributes
- **Example**: "iPhone case" is in Electronics (primary category)
- Facet 1: Category = Electronics
- Facet 2: Price = Under $50
- Facet 3: Use Case = Gifts
- **Pro**: Clear, flexible filtering, scalable
- **Con**: Requires good facet design
**When to use each**:
- **Polyhierarchy**: Small content sets (<500 items), clear user need for multiple paths
- **Faceted**: Large content sets (>500 items), many attributes, users need flexible filtering
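In data terms, a faceted item keeps one canonical category plus orthogonal attributes that filters combine over. A sketch (all names and values illustrative):

```python
from dataclasses import dataclass, field

@dataclass
class Product:
    name: str
    category: str                                     # single canonical location
    price: float
    use_cases: set[str] = field(default_factory=set)  # orthogonal facet

catalog = [
    Product("iPhone case", "Electronics", 29.99, {"Gifts"}),
    Product("Garden hose", "Home & Garden", 19.99),
]

# Faceted filtering: combine facets without moving items in the hierarchy.
tech_gifts_under_50 = [
    p for p in catalog if p.price < 50 and "Gifts" in p.use_cases
]
```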
### Controlled Vocabulary vs. Folksonomy
**Controlled vocabulary**: Preset tags, curated by admins
- **Example**: "Authentication", "API", "Database" (exact tags, no variations)
- **Pro**: Consistency, findability, no duplication ("Auth" vs "Authentication")
- **Con**: Requires maintenance, may miss user terminology
**Folksonomy**: User-generated tags, anyone can create
- **Example**: Users tag articles with whatever terms they want
- **Pro**: Emergent, captures user language, low maintenance
- **Con**: Inconsistent, duplicates, noise ("Auth", "Authentication", "auth", "Authn")
**Hybrid approach** (recommended):
- Controlled vocabulary for core categories and facets
- Folksonomy for supplementary tags (with moderation)
- Periodically review folksonomy tags → promote common ones to controlled vocabulary
**Tag moderation**:
- Merge synonyms: "Auth" → "Authentication"
- Remove noise: "asdf", "test"
- Suggest tags: When user types "auth", suggest "Authentication"
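The merge and noise-removal steps are typically backed by a curated synonym map; a minimal sketch:

```python
# Hypothetical synonym map maintained by the taxonomy owner.
CANONICAL = {"auth": "Authentication", "authn": "Authentication"}
NOISE = {"asdf", "test"}

def moderate(tag: str) -> str | None:
    """Merge synonyms into their canonical tag; drop noise tags entirely."""
    key = tag.strip().lower()
    if key in NOISE:
        return None
    return CANONICAL.get(key, tag.strip())

assert moderate("Auth") == "Authentication"
assert moderate("asdf") is None
```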
### Category Size & Balance
**Guideline**: Aim for balanced category sizes (no one category dominates)
**Red flags**:
- **One huge category**: "Other" with 60% of items → need better taxonomy
- **Many tiny categories**: 20 categories, each with 2-5 items → over-categorization, consolidate
- **Unbalanced tree**: One branch 5 levels deep, others 2 levels → inconsistent complexity
**Target distribution**:
- Top-level categories: 5-9 categories
- Each category: Roughly equal # of items (within 2× of each other)
- If one category much larger: Split into subcategories
**Example**: E-commerce with 1000 products
- **Bad**: Electronics (600), Clothing (300), Books (80), Other (20)
- **Good**: Electronics (250), Clothing (250), Books (250), Home & Garden (250)
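The "within 2x" guideline is easy to audit from category counts; a sketch using the bad distribution above:

```python
# Item counts per top-level category (the "bad" example above).
counts = {"Electronics": 600, "Clothing": 300, "Books": 80, "Other": 20}

largest = max(counts, key=counts.get)
smallest = min(counts, key=counts.get)
if counts[largest] > 2 * counts[smallest]:
    print(f"Unbalanced: {largest} ({counts[largest]}) is more than "
          f"2x {smallest} ({counts[smallest]}); consider splitting {largest}")
```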
### Taxonomy Evolution
**Principle**: Taxonomies grow and change — design for evolution
**Strategies**:
1. **Leave room for growth**: Don't create 10 top-level categories if you'll need 15 next year
2. **Use "Other" temporarily**: New category emerging but not big enough yet? Use "Other" until critical mass
3. **Versioning**: Date taxonomy versions, track changes over time
4. **Deprecation**: Don't delete categories immediately — mark "deprecated", redirect users, then remove after transition period
**Example**: Software product adding ML features
- **Today**: 20 ML-related articles scattered across "Advanced", "API", "Tutorials"
- **Transition**: Create "Machine Learning" subcategory under "Advanced"
- **Future**: 100 ML articles → Promote "Machine Learning" to top-level category
---
## 3. Navigation Depth & Breadth Optimization
### Hick's Law & Choice Overload
**Hick's Law**: Decision time increases logarithmically with number of choices
**Formula**: Time = a + b × log₂(n + 1)
- More choices → longer time to decide
**Implications for IA**:
- **5-9 items per level**: Sweet spot (Miller's "7±2")
- **>12 items**: Users feel overwhelmed, scan inefficiently
- **<3 items**: Feels unnecessarily nested
**Example**:
- 100 items, flat (1 level, 100 choices): Overwhelming
- 100 items, 2 levels (10 × 10): Manageable
- 100 items, 4 levels (3 × 3 × 3 × 4): Too many clicks
**Optimal for 100 items**: 3 levels (5 × 5 × 4) or (7 × 7 × 2)
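A rough way to compare shapes is to sum one Hick's-Law decision per level along a path (a and b are empirical constants; the values below are placeholders, not measurements):

```python
from math import log2, prod

def path_time(branching, a=0.5, b=0.15):
    """Total decision time (s) down one path: one Hick's-Law choice per level."""
    return sum(a + b * log2(n + 1) for n in branching)

for shape in [(100,), (10, 10), (5, 5, 4)]:
    assert prod(shape) >= 100           # each shape must cover ~100 items
    print(shape, f"{path_time(shape):.2f}s")
```

Note that the pure log model favors the flat list, because Hick's Law assumes practiced choices over ordered options; scanning an unfamiliar 100-item menu is closer to linear, which is why the 5-9 items-per-level guideline still wins in practice.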
### The "3-Click Rule" Myth
**Myth**: Users abandon if content requires >3 clicks
**Reality**: Users tolerate clicks if:
1. **Progress is clear**: Breadcrumbs, page titles show "getting closer"
2. **Information scent is strong**: Each click brings them closer to goal (see Section 4)
3. **No dead ends**: Every click leads somewhere useful
**Research** (UIE study): Users successfully completed tasks requiring 5-12 clicks when navigation was clear
**Guideline**: Minimize clicks, but prioritize clarity over absolute number
- **Good**: 5 clear, purposeful clicks
- **Bad**: 2 clicks but confusing labels, users backtrack
### Breadth-First vs. Depth-First Navigation
**Breadth-first** (shallow, many top-level options):
- **Structure**: 10-15 top-level categories, 2-3 levels deep
- **Best for**: Browsing, exploration, users know general area but not exact item
- **Example**: News sites, e-commerce homepages
**Depth-first** (narrow, few top-level but deep):
- **Structure**: 3-5 top-level categories, 4-6 levels deep
- **Best for**: Specific lookup, expert users, hierarchical domains
- **Example**: Technical documentation, academic libraries
**Hybrid** (recommended for most):
- **Structure**: 5-7 top-level categories, 3-4 levels deep
- **Supplement with**: Search, filters, related links to "shortcut" across hierarchy
### Progressive Disclosure
**Principle**: Start simple, reveal complexity on-demand
**Techniques**:
1. **Hub-and-spoke**: Overview page → Detailed pages
- Hub: "Getting Started" with 5 clear entry points
- Spokes: Detailed guides linked from hub
2. **Accordion/Collapse**: Hide detail until user expands
- Navigation: Show categories, hide subcategories until expanded
- Content: Show summary, expand for full text
3. **Tiered navigation**: Primary nav (always visible) + secondary nav (contextual)
- Primary: "Products", "Support", "About"
- Secondary (when in "Products"): "Electronics", "Clothing", "Books"
4. **"More..." links**: Show top N items, hide rest until "Show more" clicked
- Navigation: Top 5 categories visible, "+3 more" link expands
**Anti-pattern**: Mega-menus showing everything at once (overwhelming)
---
## 4. Information Scent & Findability
### Information Scent
**Definition**: Cues that indicate whether a path will lead to desired information
**Strong scent**: Clear labels, descriptive headings, users click confidently
**Weak scent**: Vague labels, users guess, backtrack often
**Example**:
- **Weak scent**: "Solutions" → What's in there? (generic)
- **Strong scent**: "Developer API Documentation" → Clear what's inside
**Optimizing information scent**:
1. **Specific labels** (not generic):
- Bad: "Resources" → Too vague
- Good: "Code Samples", "Video Tutorials", "White Papers" → Specific
2. **Trigger words** (match user vocabulary):
- Card sort reveals users say "How do I..." → Label category "How-To Guides"
- Users search "pricing" → Ensure "Pricing" in nav, not "Plans" or "Subscription"
3. **Descriptive breadcrumbs**:
- Bad: "Home > Section 1 > Page 3" → No meaning
- Good: "Home > Developer Docs > API Reference" → Clear path
4. **Preview text**: Show snippet of content under link
- Navigation item: "API Reference" + "Complete list of endpoints and parameters"
### Findability Metrics
**Key metrics to track**:
1. **Time to find**: How long to locate content?
- **Target**: <30 sec for simple tasks, <2 min for complex
- **Measurement**: Task completion time in usability tests
2. **Success rate**: % of users who find the content
- **Target**: ≥70% (tree test), ≥80% (live site with search)
- **Measurement**: Tree test results, task success in usability tests
3. **Search vs. browse**: Do users search or navigate?
- **Good**: 40-60% browse, 40-60% search (both work)
- **Bad**: 90% search (navigation broken), 90% browse (search broken)
- **Measurement**: Analytics (search usage %, nav click-through)
4. **Search refinement rate**: % of searches that are refined (a computation sketch follows this list)
- **Target**: <30% (users find on first search)
- **Bad**: >50% (users search, refine, search again → poor results)
- **Measurement**: Analytics (queries per session)
5. **Bounce rate by entry point**: % of visitors leaving immediately
- **Target**: <40% for landing pages
- **Bad**: >60% (users don't find what they expected)
- **Measurement**: Analytics (bounce rate by page)
6. **Navigation abandonment**: % who start navigating, then leave
- **Target**: <20%
- **Bad**: >40% (users get lost, give up)
- **Measurement**: Analytics (drop-off in navigation funnels)
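Several of these fall out of a raw analytics event stream. A sketch computing the search refinement rate (metric 4), assuming a hypothetical `(session, event, detail)` log format:

```python
from collections import defaultdict

# Hypothetical analytics events: (session_id, event_type, detail).
events = [
    ("s1", "search", "pricing"),
    ("s1", "search", "subscription pricing"),   # refined query
    ("s2", "search", "api reference"),
    ("s2", "click", "/docs/api"),
]

queries_by_session = defaultdict(list)
for session, kind, detail in events:
    if kind == "search":
        queries_by_session[session].append(detail)

refined = sum(1 for q in queries_by_session.values() if len(q) > 1)
rate = refined / len(queries_by_session)
print(f"Search refinement rate: {rate:.0%}")    # target <30%
```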
### Search vs. Navigation Trade-offs
**When search is preferred**:
- Large content sets (>5000 items)
- Users know exactly what they want ("lookup" mode)
- Diverse content types (hard to categorize consistently)
**When navigation is preferred**:
- Smaller content sets (<500 items)
- Users browsing, exploring ("discovery" mode)
- Hierarchical domains (clear parent-child relationships)
**Best practice**: Offer BOTH
- Navigation for discovery, context, exploration
- Search for lookup, speed, known-item finding
**Optimizing search**:
- **Autocomplete**: Suggest as user types (see the sketch below)
- **Filters**: Narrow results by category, date, type
- **Best bets**: Featured results for common queries
- **Zero-results page**: Suggest alternatives, show popular content
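Of these, autocomplete is the simplest to prototype: a prefix match over the controlled vocabulary is often enough before graduating to a real search index. A sketch (vocabulary is illustrative):

```python
VOCABULARY = ["Pricing", "Products", "Profile Settings", "API Reference"]

def autocomplete(prefix: str, limit: int = 5) -> list[str]:
    """Suggest vocabulary terms matching what the user has typed so far."""
    p = prefix.strip().lower()
    return [t for t in VOCABULARY if t.lower().startswith(p)][:limit]

print(autocomplete("pr"))   # ['Pricing', 'Products', 'Profile Settings']
```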
**Optimizing navigation**:
- **Clear labels**: Match user vocabulary (card sort insights)
- **Faceted filters**: Browse + filter combination
- **Related links**: Help users discover adjacent content
- **Breadcrumbs**: Show path, enable backtracking
---
## 5. Advanced Topics
### Mental Models & User Research
**Mental model**: User's internal representation of how system works
**Why it matters**: Navigation should match user's mental model, not company's org chart
**Researching mental models**:
1. **Card sorting**: Reveals how users group/label content
2. **User interviews**: Ask "How would you organize this?" "What would you call this?"
3. **Tree testing**: Validates if proposed structure matches mental model
4. **First-click testing**: Where do users expect to find X?
**Common mismatches**:
- **Company thinks**: "Features" (technical view)
- **Users think**: "What can I do?" (task view)
- **Solution**: Rename to task-based labels ("Create Report", "Share Dashboard")
**Example**: SaaS product
- **Internal (wrong)**: "Modules" → "Synergistic Solutions" → "Widget Management"
- **User mental model (right)**: "Features" → "Reporting" → "Custom Reports"
### Cross-Cultural IA
**Challenge**: Different cultures have different categorization preferences
**Examples**:
- **Alphabetical**: Works for Latin scripts, not ideographic (Chinese, Japanese)
- **Color coding**: Red = danger (Western), Red = luck (Chinese)
- **Icons**: Mailbox icon = email (US), doesn't translate (many countries have different mailbox designs)
**Strategies**:
1. **Localization testing**: Card sort with target culture users
2. **Avoid culturally-specific metaphors**: "Home run", "touchdown" (US sports)
3. **Simple, universal labels**: "Home", "Search", "Help" (widely understood)
4. **Icons + text**: Don't rely on icons alone
### IA Governance
**Problem**: Taxonomy degrades over time without maintenance
**Governance framework**:
1. **Roles**:
- **Content owner**: Publishes content, assigns categories/tags
- **Taxonomy owner**: Maintains category structure, adds/removes categories
- **IA steward**: Monitors usage, recommends improvements
2. **Processes**:
- **Quarterly review**: Check taxonomy usage, identify issues
- **Change request**: How to propose new categories or restructure
- **Deprecation**: Process for removing outdated categories
- **Tag moderation**: Review user-generated tags, merge synonyms
3. **Metrics to monitor** (an audit sketch follows this list):
- % content in "Other" or "Uncategorized" (should be <5%)
- Empty categories (no content) — remove or consolidate
- Oversized categories (>50% of content) — split into subcategories
4. **Tools**:
- CMS with taxonomy management
- Analytics to track usage
- Automated alerts (e.g., "Category X has no content")
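The monitoring metrics in item 3 are straightforward to script against a content export; a sketch, with illustrative data and the thresholds from above:

```python
# Hypothetical export: category -> number of content items.
counts = {"Docs": 400, "Tutorials": 90, "Other": 60, "Legacy API": 0}
total = sum(counts.values())

other_share = counts.get("Other", 0) / total
if other_share > 0.05:
    print(f'"Other" holds {other_share:.0%} of content (target <5%)')

for category, n in counts.items():
    if n == 0:
        print(f"Empty category: {category}; remove or consolidate")
    elif n / total > 0.50:
        print(f"Oversized category: {category} ({n / total:.0%}); split it")
```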
### Personalization & Dynamic IA
**Concept**: Navigation adapts to user
**Approaches**:
1. **Audience-based**: Show different nav for different user types
- "For Developers", "For Marketers", "For Executives"
2. **History-based**: Prioritize recently visited or frequently used
- "Recently Viewed", "Your Favorites"
3. **Context-based**: Show nav relevant to current task
- "Related Articles", "Next Steps"
4. **Adaptive search**: Results ranked by user's past behavior
**Caution**: Don't over-personalize
- Users need consistency to build mental model
- Personalization should augment, not replace, standard navigation
### IA for Voice & AI Interfaces
**Challenge**: Traditional visual hierarchy doesn't work for voice
**Strategies**:
1. **Flat structure**: No deep nesting (can't show menu)
2. **Natural language categories**: "Where can I find information about X?" vs. "Navigate to Category > Subcategory"
3. **Conversational**: "What would you like to do?" vs. "Select option 1, 2, or 3"
4. **Context-aware**: Remember user's previous question, continue conversation
**Example**:
- **Web**: Home > Products > Electronics > Phones
- **Voice**: "Show me phones" → "Here are our top phone options..."
---
## Summary
**Card sorting** reveals user mental models through similarity matrices, dendrograms, and consensus scores. Outliers indicate unclear content.
**Taxonomy design** follows MECE principle (mutually exclusive, collectively exhaustive). Use faceted classification for scale, controlled vocabulary for consistency, and plan for evolution.
**Navigation optimization** balances breadth (many choices) vs. depth (many clicks). Optimal: 5-9 items per level, 3-4 levels deep. Progressive disclosure reduces initial complexity.
**Information scent** guides users with clear labels, trigger words, and descriptive breadcrumbs. Track findability metrics: time to find (<30 sec), success rate (≥70%), search vs. browse balance (40-60% each).
**Advanced techniques** include mental model research (card sort, interviews), cross-cultural adaptation, governance frameworks, personalization, and voice interface design.
**The goal**: Users can predict where information lives and find it quickly, regardless of access method.