Initial commit

Zhongwei Li
2025-11-30 08:38:26 +08:00
commit 41d9f6b189
304 changed files with 98322 additions and 0 deletions

# Information Architecture: Advanced Methodology
This document covers advanced techniques for card sorting analysis, taxonomy design, navigation optimization, and findability improvement.
## Table of Contents
1. [Card Sorting Analysis](#1-card-sorting-analysis)
2. [Taxonomy Design Principles](#2-taxonomy-design-principles)
3. [Navigation Depth & Breadth Optimization](#3-navigation-depth--breadth-optimization)
4. [Information Scent & Findability](#4-information-scent--findability)
5. [Advanced Topics](#5-advanced-topics)
---
## 1. Card Sorting Analysis
### Analyzing Card Sort Results
**Goal**: Extract meaningful patterns from user groupings
### Similarity Matrix
**What it is**: Shows how often users grouped two cards together
**How to calculate**:
- For each pair of cards, count how many users put them in the same group
- Express as a percentage: (# users who grouped the pair together) / (total users) × 100
**Example**:
| | Sign Up | First Login | Quick Start | Reports | Dashboards |
|--|---------|-------------|-------------|---------|------------|
| Sign Up | - | 85% | 90% | 15% | 10% |
| First Login | 85% | - | 88% | 12% | 8% |
| Quick Start | 90% | 88% | - | 10% | 12% |
| Reports | 15% | 12% | 10% | - | 75% |
| Dashboards | 10% | 8% | 12% | 75% | - |
**Interpretation**:
- **Strong clustering** (>70%): "Sign Up", "First Login", "Quick Start" belong together → "Getting Started" category
- **Strong clustering** (75%): "Reports" and "Dashboards" belong together → "Analytics" category
- **Weak links** (<20%): "Getting Started" and "Analytics" are distinct categories
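
The counting rule above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the input format (one list of card-name sets per participant) and the sample data are assumptions made for the example.

```python
from itertools import combinations

def similarity_matrix(sorts):
    """Return {(card_a, card_b): share of participants who grouped both together}.

    `sorts` holds one entry per participant; each entry is a list of groups,
    and each group is a set of card names (a hypothetical input format).
    """
    counts = {}
    for groups in sorts:
        for group in groups:
            # Sort the cards so (a, b) and (b, a) share one key.
            for pair in combinations(sorted(group), 2):
                counts[pair] = counts.get(pair, 0) + 1
    return {pair: n / len(sorts) for pair, n in counts.items()}

# Made-up data: four participants sorting four cards.
sorts = [
    [{"Sign Up", "First Login"}, {"Reports", "Dashboards"}],
    [{"Sign Up", "First Login"}, {"Reports", "Dashboards"}],
    [{"Sign Up", "First Login", "Reports"}, {"Dashboards"}],
    [{"Sign Up"}, {"First Login"}, {"Reports", "Dashboards"}],
]
matrix = similarity_matrix(sorts)
print(matrix[("First Login", "Sign Up")])  # 0.75
print(matrix[("Dashboards", "Reports")])   # 0.75
```

Pairs that no participant grouped together are absent from the result, so callers may want `matrix.get(pair, 0.0)` when reading it.
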
### Dendrogram (Hierarchical Clustering)
**What it is**: Tree diagram showing hierarchical relationships
**How to create**:
1. Start with each card as its own cluster
2. Iteratively merge closest clusters (highest similarity)
3. Continue until all cards in one cluster
**Interpreting dendrograms**:
- **Short branches**: High agreement (merge early)
- **Long branches**: Low agreement (merge late)
- **Clusters**: Cut tree at appropriate height to identify categories
**Example**:
```
                     All Cards
                         |
        _________________+_________________
        |                                 |
  Getting Started                      Features
        |                                 |
    ____+____                        _____+_____
    |       |                        |         |
 Sign Up  First Login            Analytics  Settings
                                     |
                                 ____+____
                                 |       |
                              Reports  Dashboards
```
**Insight**: Users see clear distinction between "Getting Started" (onboarding tasks) and "Features" (ongoing use).
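
The three merge steps can be sketched in plain Python using the similarity values from the matrix earlier in this section. The choice of average linkage (mean similarity across all cross-cluster pairs) is an assumption; the text does not say which linkage rule to use, and dedicated clustering libraries offer several.

```python
def cluster(cards, sim):
    """Agglomerative clustering with average linkage (an assumed linkage rule).

    `sim` maps frozenset({a, b}) to a similarity in [0, 1].
    Returns the merge history as (cluster_a, cluster_b, similarity) tuples.
    """
    def linkage(c1, c2):
        # Average similarity over every cross-cluster card pair.
        pairs = [(a, b) for a in c1 for b in c2]
        return sum(sim[frozenset(p)] for p in pairs) / len(pairs)

    clusters = [(card,) for card in cards]  # step 1: one cluster per card
    merges = []
    while len(clusters) > 1:                # step 3: stop at a single cluster
        # Step 2: find the two clusters with the highest similarity.
        score, i, j = max(
            (linkage(x, y), i, j)
            for i, x in enumerate(clusters)
            for j, y in enumerate(clusters) if i < j
        )
        merges.append((clusters[i], clusters[j], round(score, 3)))
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
    return merges

cards = ["Sign Up", "First Login", "Quick Start", "Reports", "Dashboards"]
table = [  # upper triangle of the similarity matrix example
    ("Sign Up", "First Login", 0.85), ("Sign Up", "Quick Start", 0.90),
    ("Sign Up", "Reports", 0.15), ("Sign Up", "Dashboards", 0.10),
    ("First Login", "Quick Start", 0.88), ("First Login", "Reports", 0.12),
    ("First Login", "Dashboards", 0.08), ("Quick Start", "Reports", 0.10),
    ("Quick Start", "Dashboards", 0.12), ("Reports", "Dashboards", 0.75),
]
sim = {frozenset((a, b)): s for a, b, s in table}

merges = cluster(cards, sim)
for left, right, score in merges:
    print(left, "+", right, "at", score)
```

Early merges with high scores correspond to short branches; the final merge joins the onboarding cards with the analytics cards at a low score, which is where a cut would separate the two categories.
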
### Agreement Score (Consensus)
**What it is**: How much users agree on groupings
**Calculation methods**:
1. **Category agreement**: % of users who created a similar category
   - Example: 18/20 users (90%) created a "Getting Started" category
2. **Pairwise agreement**: Average similarity across all card pairs
   - Formula: Sum(all pairwise similarities) / Number of pairs
   - High score (>70%) = strong consensus
   - Low score (<50%) = weak consensus, needs refinement
**When consensus is low**:
- Cards may be ambiguous (clarify labels)
- Users have different mental models (consider multiple navigation paths)
- Category is too broad (split into subcategories)
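
As a worked example, the pairwise formula can be applied to the five-card similarity matrix from earlier in this section. The helper name `pairwise_agreement` and the option to restrict it to one candidate category are illustrative assumptions, not part of any standard tool.

```python
from statistics import mean

# Pairwise similarities (in %) from the similarity matrix example.
table = [
    ("Sign Up", "First Login", 85), ("Sign Up", "Quick Start", 90),
    ("Sign Up", "Reports", 15), ("Sign Up", "Dashboards", 10),
    ("First Login", "Quick Start", 88), ("First Login", "Reports", 12),
    ("First Login", "Dashboards", 8), ("Quick Start", "Reports", 10),
    ("Quick Start", "Dashboards", 12), ("Reports", "Dashboards", 75),
]
sim = {frozenset((a, b)): s for a, b, s in table}

def pairwise_agreement(sim, cards=None):
    """Mean similarity over all pairs, or only over pairs inside `cards`."""
    values = [s for pair, s in sim.items() if cards is None or pair <= set(cards)]
    return mean(values)

overall = pairwise_agreement(sim)
getting_started = pairwise_agreement(sim, ["Sign Up", "First Login", "Quick Start"])
analytics = pairwise_agreement(sim, ["Reports", "Dashboards"])

print(overall)                    # 40.5
print(round(getting_started, 1))  # 87.7
print(analytics)                  # 75
```

The overall average (40.5%) is pulled down by cross-category pairs, which are expected to be low when categories are distinct. Averaging within each candidate category, as in the last two lines, may therefore be the more useful reading of the thresholds.
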
### Outlier Cards
**What they are**: Cards that don't fit anywhere consistently
**How to identify**: Low similarity with all other cards (<30% with any card)
**Common reasons**:
- Card label is unclear → Rewrite card
- Content doesn't belong in product → Remove
- Content is unique → Create standalone category or utility link
**Example**: "Billing" card: 15 of 20 users put it in "Settings", 3 in "Account", and 2 left it uncategorized
- **Action**: Decide whether "Billing" is settings (configuration) or account (transactions), then relabel the card so the intended meaning is clear
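
The identification rule (best similarity below 30% with every other card) can be sketched as follows. The card "Release Notes" and all similarity values are made up for illustration.

```python
def find_outliers(cards, sim, threshold=0.30):
    """Cards whose best similarity with any other card is below `threshold`."""
    outliers = []
    for card in cards:
        # Pairs missing from `sim` are treated as never grouped together.
        best = max(
            sim.get(frozenset((card, other)), 0.0)
            for other in cards if other != card
        )
        if best < threshold:
            outliers.append(card)
    return outliers

# Hypothetical data: "Release Notes" never clusters strongly with anything.
cards = ["Sign Up", "First Login", "Reports", "Dashboards", "Release Notes"]
sim = {
    frozenset(("Sign Up", "First Login")): 0.85,
    frozenset(("Reports", "Dashboards")): 0.75,
    frozenset(("Release Notes", "Sign Up")): 0.20,
    frozenset(("Release Notes", "Reports")): 0.25,
}
outliers = find_outliers(cards, sim)
print(outliers)  # ['Release Notes']
```
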
---
## 2. Taxonomy Design Principles
### Mutually Exclusive, Collectively Exhaustive (MECE)
**Principle**: Categories don't overlap AND cover all content
**Mutually exclusive**: Each item belongs to exactly ONE category
- **Bad**: "Products" and "Best Sellers" (best sellers are also products — overlap)
- **Good**: "Products" (all) and "Featured" (separate facet or tag)
**Collectively exhaustive**: Every item has a category
- **Bad**: Categories: "Electronics", "Clothing" — but you also sell "Books" (gap)
- **Good**: Add "Books" OR create "Other" catch-all
**Testing MECE**:
1. List all content items
2. Try to categorize each
3. If item fits >1 category → not mutually exclusive
4. If item fits 0 categories → not collectively exhaustive
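
The four-step test lends itself to a small script. The sketch below assumes the taxonomy is available as a mapping from category name to a set of item names; the items and categories are invented for the example.

```python
def mece_report(items, categories):
    """Check a taxonomy for overlaps and gaps.

    `categories` maps category name -> set of item names (assumed format).
    Returns (overlaps, gaps): items in more than one category, items in none.
    """
    overlaps, gaps = {}, []
    for item in items:
        homes = sorted(name for name, members in categories.items() if item in members)
        if len(homes) > 1:
            overlaps[item] = homes   # not mutually exclusive
        elif not homes:
            gaps.append(item)        # not collectively exhaustive
    return overlaps, gaps

items = ["Headphones", "T-shirt", "Novel"]
categories = {
    "Electronics": {"Headphones"},
    "Clothing": {"T-shirt"},
    "Best Sellers": {"Headphones"},
}
overlaps, gaps = mece_report(items, categories)
print(overlaps)  # {'Headphones': ['Best Sellers', 'Electronics']}
print(gaps)      # ['Novel']
```
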
### Polyhierarchy vs. Faceted Classification
**Polyhierarchy**: Item can live in multiple places in hierarchy
- **Example**: "iPhone case" could be in:
  - Electronics > Accessories > Phone Accessories
  - Gifts > Under $50 > Tech Gifts
- **Pro**: Matches multiple user mental models
- **Con**: Confusing (where is "canonical" location?), hard to maintain

**Faceted classification**: Item has ONE location, multiple orthogonal attributes
- **Example**: "iPhone case" is in Electronics (primary category)
  - Facet 1: Category = Electronics
  - Facet 2: Price = Under $50
  - Facet 3: Use Case = Gifts
- **Pro**: Clear, flexible filtering, scalable
- **Con**: Requires good facet design
**When to use each**:
- **Polyhierarchy**: Small content sets (<500 items), clear user need for multiple paths
- **Faceted**: Large content sets (>500 items), many attributes, users need flexible filtering
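
To illustrate how facets support flexible filtering, here is a minimal sketch in which each item is a flat record and any combination of facet values can be used as a filter. The field names (`category`, `price`, `use_case`) and every product other than the iPhone case are assumptions for the example.

```python
products = [
    {"name": "iPhone case", "category": "Electronics", "price": "Under $50", "use_case": "Gifts"},
    {"name": "Laptop", "category": "Electronics", "price": "Over $50", "use_case": "Work"},
    {"name": "Scented candle", "category": "Home", "price": "Under $50", "use_case": "Gifts"},
]

def filter_by_facets(items, **facets):
    """Keep items whose attributes match every requested facet value."""
    return [
        item for item in items
        if all(item.get(facet) == value for facet, value in facets.items())
    ]

cheap_gifts = filter_by_facets(products, price="Under $50", use_case="Gifts")
print([p["name"] for p in cheap_gifts])  # ['iPhone case', 'Scented candle']

electronics = filter_by_facets(products, category="Electronics")
print([p["name"] for p in electronics])  # ['iPhone case', 'Laptop']
```

Each item is stored once; the "Gifts under $50" view is a query rather than a second location in the hierarchy.
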
### Controlled Vocabulary vs. Folksonomy
**Controlled vocabulary**: Preset tags, curated by admins
- **Example**: "Authentication", "API", "Database" (exact tags, no variations)
- **Pro**: Consistency, findability, no duplication ("Auth" vs "Authentication")
- **Con**: Requires maintenance, may miss user terminology
**Folksonomy**: User-generated tags, anyone can create
- **Example**: Users tag articles with whatever terms they want
- **Pro**: Emergent, captures user language, low maintenance
- **Con**: Inconsistent, duplicates, noise ("Auth", "Authentication", "auth", "Authn")
**Hybrid approach** (recommended):
- Controlled vocabulary for core categories and facets
- Folksonomy for supplementary tags (with moderation)
- Periodically review folksonomy tags → promote common ones to controlled vocabulary
**Tag moderation**:
- Merge synonyms: "Auth" → "Authentication"
- Remove noise: "asdf", "test"
- Suggest tags: When user types "auth", suggest "Authentication"
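
The three moderation steps can be sketched as below. The synonym map, noise list, and controlled vocabulary are hard-coded here purely for illustration; in practice they would presumably live in the CMS or a taxonomy tool.

```python
# Hypothetical moderation tables.
CONTROLLED = ["Authentication", "API", "Database"]
SYNONYMS = {"auth": "Authentication", "authn": "Authentication"}
NOISE = {"asdf", "test"}
BY_LOWER = {tag.lower(): tag for tag in CONTROLLED}

def moderate(tags):
    """Drop noise, map synonyms to canonical tags, and remove duplicates."""
    cleaned = []
    for tag in tags:
        key = tag.strip().lower()
        if not key or key in NOISE:
            continue
        # Prefer a synonym mapping, then the controlled spelling, then the raw tag.
        canonical = SYNONYMS.get(key) or BY_LOWER.get(key) or tag.strip()
        if canonical not in cleaned:
            cleaned.append(canonical)
    return cleaned

def suggest(prefix):
    """Suggest controlled tags that start with what the user has typed."""
    return [tag for tag in CONTROLLED if tag.lower().startswith(prefix.lower())]

print(moderate(["Auth", "authn", "asdf", "api", "Authentication"]))
# ['Authentication', 'API']
print(suggest("auth"))  # ['Authentication']
```
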
### Category Size & Balance
**Guideline**: Aim for balanced category sizes (no one category dominates)
**Red flags**:
- **One huge category**: "Other" with 60% of items → need better taxonomy
- **Many tiny categories**: 20 categories, each with 2-5 items → over-categorization, consolidate
- **Unbalanced tree**: One branch 5 levels deep, others 2 levels → inconsistent complexity
**Target distribution**:
- Top-level categories: 5-9 categories
- Each category: Roughly equal # of items (within 2× of each other)
- If one category much larger: Split into subcategories
**Example**: E-commerce with 1000 products
- **Bad**: Electronics (600), Clothing (300), Books (80), Other (20)
- **Good**: Electronics (250), Clothing (250), Books (250), Home & Garden (250)
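
The "within 2× of each other" guideline can be checked mechanically. The sketch below compares every category to the smallest one, which is one possible reading of the guideline, and assumes no category is empty.

```python
def balance_ratio(sizes):
    """Largest category size divided by the smallest (assumes no empty category)."""
    return max(sizes.values()) / min(sizes.values())

def oversized(sizes, factor=2):
    """Categories more than `factor` times the size of the smallest one."""
    smallest = min(sizes.values())
    return [name for name, n in sizes.items() if n > factor * smallest]

bad = {"Electronics": 600, "Clothing": 300, "Books": 80, "Other": 20}
good = {"Electronics": 250, "Clothing": 250, "Books": 250, "Home & Garden": 250}

print(balance_ratio(bad), oversized(bad))    # 30.0 ['Electronics', 'Clothing', 'Books']
print(balance_ratio(good), oversized(good))  # 1.0 []
```
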
### Taxonomy Evolution
**Principle**: Taxonomies grow and change — design for evolution
**Strategies**:
1. **Leave room for growth**: Don't create 10 top-level categories if you'll need 15 next year
2. **Use "Other" temporarily**: New category emerging but not big enough yet? Use "Other" until critical mass
3. **Versioning**: Date taxonomy versions, track changes over time
4. **Deprecation**: Don't delete categories immediately — mark "deprecated", redirect users, then remove after transition period
**Example**: Software product adding ML features
- **Today**: 20 ML-related articles scattered across "Advanced", "API", "Tutorials"
- **Transition**: Create "Machine Learning" subcategory under "Advanced"
- **Future**: 100 ML articles → Promote "Machine Learning" to top-level category
---
## 3. Navigation Depth & Breadth Optimization
### Hick's Law & Choice Overload
**Hick's Law**: Decision time increases logarithmically with number of choices
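**Formula**: Time = a + b × log₂(n + 1), where n is the number of equally likely choices and a and b are empirically fitted constants
- More choices → longer time to decide, but each added choice costs less than the one before
**Implications for IA**:
- **5-9 items per level**: Common rule of thumb, loosely borrowed from Miller's "7±2" (a finding about working-memory span, not menu size)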
- **>12 items**: Users feel overwhelmed, scan inefficiently
- **<3 items**: Feels unnecessarily nested
**Example**:
- 100 items, flat (1 level, 100 choices): Overwhelming
- 100 items, 2 levels (10 × 10): Manageable
- 100 items, 4 levels (3 × 3 × 3 × 4): Too many clicks
**Optimal for 100 items**: 3 levels, e.g. (5 × 5 × 4 = 100) or (7 × 7 × 2 = 98, approximately)
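
The arithmetic behind these structures can be checked with a short sketch. `min_depth` is a hypothetical helper that assumes every level offers the same maximum number of choices; real hierarchies are rarely that uniform.

```python
from math import prod

def min_depth(items, breadth):
    """Fewest levels needed to reach `items` leaves with at most `breadth` choices per level."""
    depth, capacity = 1, breadth
    while capacity < items:
        capacity *= breadth
        depth += 1
    return depth

# Depth required for 100 items at different breadths.
for breadth in (100, 10, 7, 5):
    print(breadth, "choices per level ->", min_depth(100, breadth), "level(s)")

# Leaf capacity of the structures mentioned in this section.
for shape in [(10, 10), (5, 5, 4), (7, 7, 2), (3, 3, 3, 4)]:
    print(shape, "holds", prod(shape), "items")
```

With 5 to 9 choices per level, 100 items cannot fit in two levels (9 × 9 = 81), which is why the recommended structures have three.
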
### The "3-Click Rule" Myth
**Myth**: Users abandon if content requires >3 clicks
**Reality**: Users tolerate clicks if:
1. **Progress is clear**: Breadcrumbs, page titles show "getting closer"
2. **Information scent is strong**: Each click brings them closer to goal (see Section 4)
3. **No dead ends**: Every click leads somewhere useful
**Research** (UIE study): Users successfully completed tasks requiring 5-12 clicks when navigation was clear
**Guideline**: Minimize clicks, but prioritize clarity over absolute number
- **Good**: 5 clear, purposeful clicks
- **Bad**: 2 clicks but confusing labels, users backtrack
### Breadth-First vs. Depth-First Navigation
**Breadth-first** (shallow, many top-level options):
- **Structure**: 10-15 top-level categories, 2-3 levels deep
- **Best for**: Browsing, exploration, users know general area but not exact item
- **Example**: News sites, e-commerce homepages
**Depth-first** (narrow, few top-level but deep):
- **Structure**: 3-5 top-level categories, 4-6 levels deep
- **Best for**: Specific lookup, expert users, hierarchical domains
- **Example**: Technical documentation, academic libraries
**Hybrid** (recommended for most):
- **Structure**: 5-7 top-level categories, 3-4 levels deep
- **Supplement with**: Search, filters, related links to "shortcut" across hierarchy
### Progressive Disclosure
**Principle**: Start simple, reveal complexity on-demand
**Techniques**:
1. **Hub-and-spoke**: Overview page → Detailed pages
   - Hub: "Getting Started" with 5 clear entry points
   - Spokes: Detailed guides linked from hub
2. **Accordion/Collapse**: Hide detail until user expands
   - Navigation: Show categories, hide subcategories until expanded
   - Content: Show summary, expand for full text
3. **Tiered navigation**: Primary nav (always visible) + secondary nav (contextual)
   - Primary: "Products", "Support", "About"
   - Secondary (when in "Products"): "Electronics", "Clothing", "Books"
4. **"More..." links**: Show top N items, hide rest until "Show more" clicked
   - Navigation: Top 5 categories visible, "+3 more" link expands
**Anti-pattern**: Mega-menus showing everything at once (overwhelming)
---
## 4. Information Scent & Findability
### Information Scent
**Definition**: Cues that indicate whether a path will lead to desired information
**Strong scent**: Clear labels, descriptive headings, users click confidently
**Weak scent**: Vague labels, users guess, backtrack often
**Example**:
- **Weak scent**: "Solutions" → What's in there? (generic)
- **Strong scent**: "Developer API Documentation" → Clear what's inside
**Optimizing information scent**:
1. **Specific labels** (not generic):
   - Bad: "Resources" → Too vague
   - Good: "Code Samples", "Video Tutorials", "White Papers" → Specific
2. **Trigger words** (match user vocabulary):
   - Card sort reveals users say "How do I..." → Label category "How-To Guides"
   - Users search "pricing" → Ensure "Pricing" in nav, not "Plans" or "Subscription"
3. **Descriptive breadcrumbs**:
   - Bad: "Home > Section 1 > Page 3" → No meaning
   - Good: "Home > Developer Docs > API Reference" → Clear path
4. **Preview text**: Show snippet of content under link
   - Navigation item: "API Reference" + "Complete list of endpoints and parameters"
### Findability Metrics
**Key metrics to track**:
1. **Time to find**: How long does it take to locate content?
   - **Target**: <30 sec for simple tasks, <2 min for complex tasks
   - **Measurement**: Task completion time in usability tests
2. **Success rate**: What % of users find the content?
   - **Target**: ≥70% (tree test), ≥80% (live site with search)
   - **Measurement**: Tree test results, task success in usability tests
3. **Search vs. browse**: Do users search or navigate?
   - **Good**: 40-60% browse, 40-60% search (both work)
   - **Bad**: 90% search (navigation broken), 90% browse (search broken)
   - **Measurement**: Analytics (search usage %, nav click-through)
4. **Search refinement rate**: What % of searches are refined?
   - **Target**: <30% (users find on first search)
   - **Bad**: >50% (users search, refine, search again → poor results)
   - **Measurement**: Analytics (queries per session)
5. **Bounce rate by entry point**: What % of visitors leave immediately?
   - **Target**: <40% for landing pages
   - **Bad**: >60% (users don't find what they expected)
   - **Measurement**: Analytics (bounce rate by page)
6. **Navigation abandonment**: What % of users start navigating, then leave?
   - **Target**: <20%
   - **Bad**: >40% (users get lost, give up)
   - **Measurement**: Analytics (drop-off in navigation funnels)
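
Two of these metrics can be computed from very simple inputs, as sketched below. The data is made up, and treating "a session with more than one query" as a refined search is an assumption; analytics tools may define refinement differently.

```python
def success_rate(outcomes):
    """Share of task attempts that ended in success (True)."""
    return sum(outcomes) / len(outcomes)

def refinement_rate(queries_per_session):
    """Share of search sessions with more than one query (assumed proxy for refinement)."""
    refined = sum(1 for q in queries_per_session if q > 1)
    return refined / len(queries_per_session)

# Made-up data: ten tree-test attempts and ten search sessions.
tree_test = [True, True, False, True, True, True, False, True, True, True]
sessions = [1, 1, 2, 1, 3, 1, 1, 1, 2, 1]

success = success_rate(tree_test)
refinement = refinement_rate(sessions)

# Compare against the targets listed above.
print(f"Success rate: {success:.0%}, meets >=70% target: {success >= 0.70}")
print(f"Refinement rate: {refinement:.0%}, meets <30% target: {refinement < 0.30}")
```
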
### Search vs. Navigation Trade-offs
**When search is preferred**:
- Large content sets (>5000 items)
- Users know exactly what they want ("lookup" mode)
- Diverse content types (hard to categorize consistently)
**When navigation is preferred**:
- Smaller content sets (<500 items)
- Users browsing, exploring ("discovery" mode)
- Hierarchical domains (clear parent-child relationships)
**Best practice**: Offer BOTH
- Navigation for discovery, context, exploration
- Search for lookup, speed, known-item finding
**Optimizing search**:
- **Autocomplete**: Suggest as user types
- **Filters**: Narrow results by category, date, type
- **Best bets**: Featured results for common queries
- **Zero-results page**: Suggest alternatives, show popular content
**Optimizing navigation**:
- **Clear labels**: Match user vocabulary (card sort insights)
- **Faceted filters**: Browse + filter combination
- **Related links**: Help users discover adjacent content
- **Breadcrumbs**: Show path, enable backtracking
---
## 5. Advanced Topics
### Mental Models & User Research
**Mental model**: User's internal representation of how system works
**Why it matters**: Navigation should match user's mental model, not company's org chart
**Researching mental models**:
1. **Card sorting**: Reveals how users group/label content
2. **User interviews**: Ask "How would you organize this?" "What would you call this?"
3. **Tree testing**: Validates if proposed structure matches mental model
4. **First-click testing**: Where do users expect to find X?
**Common mismatches**:
- **Company thinks**: "Features" (technical view)
- **Users think**: "What can I do?" (task view)
- **Solution**: Rename to task-based labels ("Create Report", "Share Dashboard")
**Example**: SaaS product
- **Internal (wrong)**: "Modules" → "Synergistic Solutions" → "Widget Management"
- **User mental model (right)**: "Features" → "Reporting" → "Custom Reports"
### Cross-Cultural IA
**Challenge**: Different cultures have different categorization preferences
**Examples**:
- **Alphabetical**: Works for alphabetic scripts (e.g. Latin); logographic scripts such as Chinese are ordered differently (by radical, stroke count, or romanized pronunciation)
- **Color coding**: Red = danger (Western), Red = luck (Chinese)
- **Icons**: Mailbox icon = email (US), doesn't translate (many countries have different mailbox designs)
**Strategies**:
1. **Localization testing**: Card sort with target culture users
2. **Avoid culturally-specific metaphors**: "Home run", "touchdown" (US sports)
3. **Simple, universal labels**: "Home", "Search", "Help" (widely understood)
4. **Icons + text**: Don't rely on icons alone
### IA Governance
**Problem**: Taxonomy degrades over time without maintenance
**Governance framework**:
1. **Roles**:
   - **Content owner**: Publishes content, assigns categories/tags
   - **Taxonomy owner**: Maintains category structure, adds/removes categories
   - **IA steward**: Monitors usage, recommends improvements
2. **Processes**:
   - **Quarterly review**: Check taxonomy usage, identify issues
   - **Change request**: How to propose new categories or restructure
   - **Deprecation**: Process for removing outdated categories
   - **Tag moderation**: Review user-generated tags, merge synonyms
3. **Metrics to monitor**:
   - % content in "Other" or "Uncategorized" (should be <5%)
   - Empty categories (no content): remove or consolidate
   - Oversized categories (>50% of content): split into subcategories
4. **Tools**:
   - CMS with taxonomy management
   - Analytics to track usage
   - Automated alerts (e.g., "Category X has no content")
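
The monitored metrics translate directly into automated alerts. The sketch below assumes category counts are available as a simple mapping; the function name, message wording, and sample data are illustrative only.

```python
def taxonomy_alerts(counts, catch_all=("Other", "Uncategorized")):
    """Return human-readable alerts for a {category: item_count} mapping."""
    total = sum(counts.values())
    if total == 0:
        return ["Taxonomy has no content"]
    alerts = []
    # Share of content sitting in catch-all categories (target: <5%).
    catch_share = sum(counts.get(name, 0) for name in catch_all) / total
    if catch_share >= 0.05:
        alerts.append(f"Catch-all categories hold {catch_share:.0%} of content (target: <5%)")
    for name, n in counts.items():
        if n == 0:
            alerts.append(f"Category '{name}' has no content")
        elif n / total > 0.50:
            alerts.append(f"Category '{name}' holds {n / total:.0%} of content (consider splitting)")
    return alerts

counts = {"Guides": 120, "API": 60, "Legacy": 0, "Other": 20}
alerts = taxonomy_alerts(counts)
for alert in alerts:
    print(alert)
```

A check like this could run as part of the quarterly review, with the thresholds adjusted to the size of the content set.
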
### Personalization & Dynamic IA
**Concept**: Navigation adapts to user
**Approaches**:
1. **Audience-based**: Show different nav for different user types
   - "For Developers", "For Marketers", "For Executives"
2. **History-based**: Prioritize recently visited or frequently used
   - "Recently Viewed", "Your Favorites"
3. **Context-based**: Show nav relevant to current task
   - "Related Articles", "Next Steps"
4. **Adaptive search**: Results ranked by user's past behavior
**Caution**: Don't over-personalize
- Users need consistency to build mental model
- Personalization should augment, not replace, standard navigation
### IA for Voice & AI Interfaces
**Challenge**: Traditional visual hierarchy doesn't work for voice
**Strategies**:
1. **Flat structure**: No deep nesting (can't show menu)
2. **Natural language categories**: "Where can I find information about X?" vs. "Navigate to Category > Subcategory"
3. **Conversational**: "What would you like to do?" vs. "Select option 1, 2, or 3"
4. **Context-aware**: Remember user's previous question, continue conversation
**Example**:
- **Web**: Home > Products > Electronics > Phones
- **Voice**: "Show me phones" → "Here are our top phone options..."
---
## Summary
**Card sorting** reveals user mental models through similarity matrices, dendrograms, and consensus scores. Outliers indicate unclear content.
**Taxonomy design** follows MECE principle (mutually exclusive, collectively exhaustive). Use faceted classification for scale, controlled vocabulary for consistency, and plan for evolution.
**Navigation optimization** balances breadth (many choices) vs. depth (many clicks). Optimal: 5-9 items per level, 3-4 levels deep. Progressive disclosure reduces initial complexity.
**Information scent** guides users with clear labels, trigger words, and descriptive breadcrumbs. Track findability metrics: time to find (<30 sec for simple tasks), success rate (≥70% in tree tests), search vs. browse balance (40-60% each).
**Advanced techniques** include mental model research (card sort, interviews), cross-cultural adaptation, governance frameworks, personalization, and voice interface design.
**The goal**: Users can predict where information lives and find it quickly, regardless of access method.