Initial commit

2025-11-30 08:38:26 +08:00
commit 41d9f6b189
304 changed files with 98322 additions and 0 deletions
--- a/skills/information-architecture/resources/methodology.md
+++ b/skills/information-architecture/resources/methodology.md
@@ -0,0 +1,494 @@
+# Information Architecture: Advanced Methodology
+
+This document covers advanced techniques for card sorting analysis, taxonomy design, navigation optimization, and findability improvement.
+
+## Table of Contents
+1. [Card Sorting Analysis](#1-card-sorting-analysis)
+2. [Taxonomy Design Principles](#2-taxonomy-design-principles)
+3. [Navigation Depth & Breadth Optimization](#3-navigation-depth--breadth-optimization)
+4. [Information Scent & Findability](#4-information-scent--findability)
+5. [Advanced Topics](#5-advanced-topics)
+
+---
+
+## 1. Card Sorting Analysis
+
+### Analyzing Card Sort Results
+
+**Goal**: Extract meaningful patterns from user groupings
+
+### Similarity Matrix
+
+**What it is**: Shows how often users grouped two cards together
+
+**How to calculate**:
+- For each pair of cards, count how many users put them in the same group
+- Express as a percentage: (# users who grouped the pair together) / (total users) × 100
+
+**Example**:
+
+|  | Sign Up | First Login | Quick Start | Reports | Dashboards |
+|--|---------|-------------|-------------|---------|------------|
+| Sign Up | - | 85% | 90% | 15% | 10% |
+| First Login | 85% | - | 88% | 12% | 8% |
+| Quick Start | 90% | 88% | - | 10% | 12% |
+| Reports | 15% | 12% | 10% | - | 75% |
+| Dashboards | 10% | 8% | 12% | 75% | - |
+
+**Interpretation**:
+- **Strong clustering** (>70%): "Sign Up", "First Login", "Quick Start" belong together → "Getting Started" category
+- **Strong clustering** (75%): "Reports" and "Dashboards" belong together → "Analytics" category
+- **Weak links** (<20%): "Getting Started" and "Analytics" are distinct categories
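+
+The calculation above can be sketched in a few lines of Python. This is a minimal illustration rather than a prescribed tool: the `similarity_matrix` function, the input shape (one list of card groups per participant), and the sample data are all assumptions made for the example.
+
```python
from itertools import combinations

def similarity_matrix(sorts):
    """Return {(card_a, card_b): share of participants who grouped them together}.

    `sorts` holds one entry per participant; each entry is a list of groups,
    and each group is a set of card names.
    """
    cards = sorted({card for groups in sorts for group in groups for card in group})
    counts = {pair: 0 for pair in combinations(cards, 2)}
    for groups in sorts:
        for group in groups:
            # Sorting keeps each pair in the same order as the dictionary keys.
            for pair in combinations(sorted(group), 2):
                counts[pair] += 1
    return {pair: count / len(sorts) for pair, count in counts.items()}

# Hypothetical data: four participants sorting four cards.
sorts = [
    [{"Sign Up", "First Login"}, {"Reports", "Dashboards"}],
    [{"Sign Up", "First Login"}, {"Reports", "Dashboards"}],
    [{"Sign Up", "First Login", "Reports"}, {"Dashboards"}],
    [{"Sign Up"}, {"First Login"}, {"Reports", "Dashboards"}],
]
matrix = similarity_matrix(sorts)
for (a, b), share in sorted(matrix.items()):
    print(f"{a} / {b}: {share:.0%}")
```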
+
+### Dendrogram (Hierarchical Clustering)
+
+**What it is**: Tree diagram showing hierarchical relationships
+
+**How to create**:
+1. Start with each card as its own cluster
+2. Iteratively merge closest clusters (highest similarity)
+3. Continue until all cards are in one cluster
+
+**Interpreting dendrograms**:
+- **Short branches**: High agreement (merge early)
+- **Long branches**: Low agreement (merge late)
+- **Clusters**: Cut tree at appropriate height to identify categories
+
+**Example**:
+```
+                        All Cards
+                            |
+        ____________________+_____________________
+        |                                         |
+    Getting Started                          Features
+        |                                         |
+    ____+____                              _____+_____
+    |        |                            |           |
+  Sign Up  First Login              Analytics    Settings
+                                        |
+                                    ____+____
+                                    |        |
+                                 Reports  Dashboards
+```
+
+**Insight**: Users see clear distinction between "Getting Started" (onboarding tasks) and "Features" (ongoing use).
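+
+The merge steps described in this section could be sketched as follows. The sketch assumes average linkage (the text does not name a linkage rule) and takes the percentages from the similarity matrix example as input; `agglomerate` and `lookup` are hypothetical names.
+
```python
def agglomerate(cards, similarity):
    """Merge the two most similar clusters until one remains (average linkage).

    `similarity(a, b)` returns the share of participants (0.0 to 1.0) who
    grouped cards a and b together. Returns a list of
    (cluster_a, cluster_b, score) tuples in merge order.
    """
    clusters = [(card,) for card in cards]
    merges = []

    def linkage(x, y):
        # Average similarity over every cross-cluster pair of cards.
        return sum(similarity(a, b) for a in x for b in y) / (len(x) * len(y))

    while len(clusters) > 1:
        pairs = ((i, j) for i in range(len(clusters)) for j in range(i + 1, len(clusters)))
        i, j = max(pairs, key=lambda ij: linkage(clusters[ij[0]], clusters[ij[1]]))
        merges.append((clusters[i], clusters[j], linkage(clusters[i], clusters[j])))
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
    return merges

# Percentages from the similarity matrix example, as fractions.
table = {
    frozenset(("Sign Up", "First Login")): 0.85,
    frozenset(("Sign Up", "Quick Start")): 0.90,
    frozenset(("Sign Up", "Reports")): 0.15,
    frozenset(("Sign Up", "Dashboards")): 0.10,
    frozenset(("First Login", "Quick Start")): 0.88,
    frozenset(("First Login", "Reports")): 0.12,
    frozenset(("First Login", "Dashboards")): 0.08,
    frozenset(("Quick Start", "Reports")): 0.10,
    frozenset(("Quick Start", "Dashboards")): 0.12,
    frozenset(("Reports", "Dashboards")): 0.75,
}

def lookup(a, b):
    return table[frozenset((a, b))]

cards = ["Sign Up", "First Login", "Quick Start", "Reports", "Dashboards"]
merges = agglomerate(cards, lookup)
for left, right, score in merges:
    print(f"{score:.0%}: {left} + {right}")
```
+
+Early merges happen at high similarity (short branches). The final merge joins the two remaining groups at a much lower score, which is the natural height at which to cut the tree into categories.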
+
+### Agreement Score (Consensus)
+
+**What it is**: How much users agree on groupings
+
+**Calculation methods**:
+
+1. **Category agreement**: % of users who created a similar category
+   - Example: 18/20 users (90%) created "Getting Started" category
+
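+2. **Pairwise agreement**: Average similarity across card pairs
+   - Formula: Sum(pairwise similarities) / Number of pairs
+   - Compute it over pairs within a proposed category; pairs that span two distinct categories are expected to score low and would drag an all-pairs average down even when users agree
+   - High score (>70%) = strong consensus
+   - Low score (<50%) = weak consensus, need refinement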
+
+**When consensus is low**:
+- Cards may be ambiguous (clarify labels)
+- Users have different mental models (consider multiple navigation paths)
+- Category is too broad (split into subcategories)
+
+### Outlier Cards
+
+**What they are**: Cards that don't fit anywhere consistently
+
+**How to identify**: Low similarity with every other card (<30% even with its closest match)
+
+**Common reasons**:
+- Card label is unclear → Rewrite card
+- Content doesn't belong in product → Remove
+- Content is unique → Create standalone category or utility link
+
+**Example**: "Billing" card — 15 users put it in "Settings", 3 in "Account", 2 didn't categorize it
+- **Action**: Clarify if "Billing" is settings (configuration) or account (transactions)
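+
+One way to apply the <30% rule is to look at each card's best match. The sketch below assumes pairwise similarities are already available as fractions keyed by card pair; `find_outliers` and the sample values (including the "Export Logs" card) are hypothetical.
+
```python
def find_outliers(similarity, threshold=0.30):
    """Return cards whose best similarity with any other card is below threshold."""
    best = {}
    for (a, b), share in similarity.items():
        best[a] = max(best.get(a, 0.0), share)
        best[b] = max(best.get(b, 0.0), share)
    return sorted(card for card, top in best.items() if top < threshold)

# Hypothetical pairwise similarities (share of participants, 0.0 to 1.0).
similarity = {
    ("Sign Up", "First Login"): 0.85,
    ("Sign Up", "Export Logs"): 0.10,
    ("First Login", "Export Logs"): 0.20,
    ("Reports", "Dashboards"): 0.75,
    ("Reports", "Export Logs"): 0.25,
    ("Dashboards", "Export Logs"): 0.15,
}
outliers = find_outliers(similarity)
print(outliers)
```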
+
+---
+
+## 2. Taxonomy Design Principles
+
+### Mutually Exclusive, Collectively Exhaustive (MECE)
+
+**Principle**: Categories don't overlap AND cover all content
+
+**Mutually exclusive**: Each item belongs to exactly ONE category
+- **Bad**: "Products" and "Best Sellers" (best sellers are also products — overlap)
+- **Good**: "Products" (all) and "Featured" (separate facet or tag)
+
+**Collectively exhaustive**: Every item has a category
+- **Bad**: Categories: "Electronics", "Clothing" — but you also sell "Books" (gap)
+- **Good**: Add "Books" OR create "Other" catch-all
+
+**Testing MECE**:
+1. List all content items
+2. Try to categorize each
+3. If item fits >1 category → not mutually exclusive
+4. If item fits 0 categories → not collectively exhaustive
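+
+The four steps above translate directly into a small check. This is a sketch under the assumption that each item has already been mapped to the list of categories it fits; `check_mece` and the sample catalog are hypothetical.
+
```python
def check_mece(assignments):
    """Split items into overlaps (more than one category) and gaps (no category)."""
    overlaps = {item: cats for item, cats in assignments.items() if len(cats) > 1}
    gaps = [item for item, cats in assignments.items() if not cats]
    return overlaps, gaps

# Hypothetical item -> categories mapping.
assignments = {
    "Laptop": ["Electronics"],
    "T-shirt": ["Clothing"],
    "Wireless headphones": ["Electronics", "Best Sellers"],
    "Novel": [],
}
overlaps, gaps = check_mece(assignments)
print("Not mutually exclusive:", overlaps)
print("Not collectively exhaustive:", gaps)
```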
+
+### Polyhierarchy vs. Faceted Classification
+
+**Polyhierarchy**: Item can live in multiple places in hierarchy
+- **Example**: "iPhone case" could be in:
+  - Electronics > Accessories > Phone Accessories
+  - Gifts > Under $50 > Tech Gifts
+- **Pro**: Matches multiple user mental models
+- **Con**: Confusing (where is "canonical" location?), hard to maintain
+
+**Faceted classification**: Item has ONE location, multiple orthogonal attributes
+- **Example**: "iPhone case" is in Electronics (primary category)
+  - Facet 1: Category = Electronics
+  - Facet 2: Price = Under $50
+  - Facet 3: Use Case = Gifts
+- **Pro**: Clear, flexible filtering, scalable
+- **Con**: Requires good facet design
+
+**When to use each**:
+- **Polyhierarchy**: Small content sets (<500 items), clear user need for multiple paths
+- **Faceted**: Large content sets (>500 items), many attributes, users need flexible filtering
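+
+To make the faceted model concrete, here is a minimal sketch in which each item has one canonical category plus a dictionary of facet values. The `Item` class, the facet names, and the catalog entries are assumptions for illustration.
+
```python
from dataclasses import dataclass, field

@dataclass
class Item:
    name: str
    category: str                                # the single canonical location
    facets: dict = field(default_factory=dict)   # orthogonal attributes

def filter_items(items, **wanted):
    """Return items whose facets match every requested facet value."""
    return [i for i in items if all(i.facets.get(k) == v for k, v in wanted.items())]

# Hypothetical catalog.
catalog = [
    Item("iPhone case", "Electronics", {"price": "Under $50", "use_case": "Gifts"}),
    Item("Laptop", "Electronics", {"price": "Over $500", "use_case": "Work"}),
    Item("Scarf", "Clothing", {"price": "Under $50", "use_case": "Gifts"}),
]
gifts_under_50 = filter_items(catalog, price="Under $50", use_case="Gifts")
print([i.name for i in gifts_under_50])
```
+
+The "iPhone case" is stored once under Electronics, yet it still surfaces for a gift-oriented query, which is the behavior polyhierarchy achieves by duplicating locations.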
+
+### Controlled Vocabulary vs. Folksonomy
+
+**Controlled vocabulary**: Preset tags, curated by admins
+- **Example**: "Authentication", "API", "Database" (exact tags, no variations)
+- **Pro**: Consistency, findability, no duplication ("Auth" vs "Authentication")
+- **Con**: Requires maintenance, may miss user terminology
+
+**Folksonomy**: User-generated tags, anyone can create
+- **Example**: Users tag articles with whatever terms they want
+- **Pro**: Emergent, captures user language, low maintenance
+- **Con**: Inconsistent, duplicates, noise ("Auth", "Authentication", "auth", "Authn")
+
+**Hybrid approach** (recommended):
+- Controlled vocabulary for core categories and facets
+- Folksonomy for supplementary tags (with moderation)
+- Periodically review folksonomy tags → promote common ones to controlled vocabulary
+
+**Tag moderation**:
+- Merge synonyms: "Auth" → "Authentication"
+- Remove noise: "asdf", "test"
+- Suggest tags: When user types "auth", suggest "Authentication"
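+
+The three moderation actions could be sketched like this. The synonym table, noise list, and function names are hypothetical; a real system would load them from wherever the controlled vocabulary is maintained.
+
```python
CONTROLLED = ["Authentication", "API", "Database"]
SYNONYMS = {"auth": "Authentication", "authn": "Authentication"}  # hypothetical merge table
NOISE = {"asdf", "test"}                                           # hypothetical blocklist

def moderate(tags):
    """Drop noise, map synonyms to the controlled term, de-duplicate in order."""
    cleaned = []
    for tag in tags:
        key = tag.strip().lower()
        if not key or key in NOISE:
            continue
        # Prefer a synonym mapping, then a case-insensitive controlled match,
        # and otherwise keep the user's tag as a folksonomy candidate.
        canonical = SYNONYMS.get(key) or next(
            (t for t in CONTROLLED if t.lower() == key), tag.strip()
        )
        if canonical not in cleaned:
            cleaned.append(canonical)
    return cleaned

def suggest(prefix):
    """Suggest controlled tags that start with what the user has typed."""
    p = prefix.strip().lower()
    return [t for t in CONTROLLED if t.lower().startswith(p)] if p else []

print(moderate(["Auth", "authentication", "asdf", "api", "GraphQL"]))
print(suggest("auth"))
```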
+
+### Category Size & Balance
+
+**Guideline**: Aim for balanced category sizes (no one category dominates)
+
+**Red flags**:
+- **One huge category**: "Other" with 60% of items → need better taxonomy
+- **Many tiny categories**: 20 categories, each with 2-5 items → over-categorization, consolidate
+- **Unbalanced tree**: One branch 5 levels deep, others 2 levels → inconsistent complexity
+
+**Target distribution**:
+- Top-level categories: 5-9 categories
+- Each category: Roughly equal # of items (within 2× of each other)
+- If one category much larger: Split into subcategories
+
+**Example**: E-commerce with 1000 products
+- **Bad**: Electronics (600), Clothing (300), Books (80), Other (20)
+- **Good**: Electronics (250), Clothing (250), Books (250), Home & Garden (250)
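+
+The 2× guideline can be checked mechanically. A minimal sketch, assuming every category has at least one item; `balance_report` is a hypothetical helper and the inputs are the two distributions from the example above.
+
```python
def balance_report(sizes, max_ratio=2.0):
    """Return (largest/smallest ratio, whether it is within max_ratio).

    Assumes every category contains at least one item.
    """
    ratio = max(sizes.values()) / min(sizes.values())
    return ratio, ratio <= max_ratio

bad = {"Electronics": 600, "Clothing": 300, "Books": 80, "Other": 20}
good = {"Electronics": 250, "Clothing": 250, "Books": 250, "Home & Garden": 250}

for label, sizes in (("bad", bad), ("good", good)):
    ratio, ok = balance_report(sizes)
    print(f"{label}: largest is {ratio:.1f}x the smallest, balanced={ok}")
```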
+
+### Taxonomy Evolution
+
+**Principle**: Taxonomies grow and change — design for evolution
+
+**Strategies**:
+1. **Leave room for growth**: Don't create 10 top-level categories if you'll need 15 next year
+2. **Use "Other" temporarily**: New category emerging but not big enough yet? Use "Other" until critical mass
+3. **Versioning**: Date taxonomy versions, track changes over time
+4. **Deprecation**: Don't delete categories immediately — mark "deprecated", redirect users, then remove after transition period
+
+**Example**: Software product adding ML features
+- **Today**: 20 ML-related articles scattered across "Advanced", "API", "Tutorials"
+- **Transition**: Create "Machine Learning" subcategory under "Advanced"
+- **Future**: 100 ML articles → Promote "Machine Learning" to top-level category
+
+---
+
+## 3. Navigation Depth & Breadth Optimization
+
+### Hick's Law & Choice Overload
+
+**Hick's Law**: Decision time increases logarithmically with number of choices
+
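+**Formula**: Time = a + b × log₂(n + 1)
+- n = number of equally likely choices; a and b = constants fitted from measurement
+- More choices → longer time to decide, though each added choice costs less than the one before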
+
+**Implications for IA**:
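+- **5-9 items per level**: Common rule of thumb (often attributed to Miller's "7±2", although that finding concerns short-term memory span rather than menu length)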
+- **>12 items**: Users feel overwhelmed, scan inefficiently
+- **<3 items**: Feels unnecessarily nested
+
+**Example**:
+- 100 items, flat (1 level, 100 choices): Overwhelming
+- 100 items, 2 levels (10 × 10): Manageable
+- 100 items, 4 levels (3 × 3 × 3 × 4): Too many clicks
+
+**Reasonable target for ~100 items**: 3 levels, such as 5 × 5 × 4 (= 100) or 7 × 7 × 2 (= 98)
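+
+To see what the formula predicts for the structures in this example, the sketch below sums Hick's Law over each level of a path. The constants `a` and `b` are placeholders chosen only for illustration, not measured values, and `decision_time` is a hypothetical helper.
+
```python
import math

def decision_time(choices_per_level, a=0.2, b=0.15):
    """Total decision time for one path through the hierarchy.

    Applies a + b * log2(n + 1) once per level. The defaults for a and b
    are placeholder values in seconds, not measured constants.
    """
    return sum(a + b * math.log2(n + 1) for n in choices_per_level)

structures = {
    "flat (100)": [100],
    "10 x 10": [10, 10],
    "5 x 5 x 4": [5, 5, 4],
    "3 x 3 x 3 x 4": [3, 3, 3, 4],
}
for label, levels in structures.items():
    print(f"{label}: {decision_time(levels):.2f} s")
```
+
+With these placeholder constants the formula favors fewer levels, because the logarithm grows slowly while every extra level adds another fixed cost `a`. The formula covers decision time only; the cost of visually scanning a long, unfamiliar list is not modeled, so treat the output as one input to the depth/breadth trade-off rather than a verdict.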
+
+### The "3-Click Rule" Myth
+
+**Myth**: Users abandon if content requires >3 clicks
+
+**Reality**: Users tolerate clicks if:
+1. **Progress is clear**: Breadcrumbs, page titles show "getting closer"
+2. **Information scent is strong**: Each click brings them closer to goal (see Section 4)
+3. **No dead ends**: Every click leads somewhere useful
+
+**Research** (UIE study): Users successfully completed tasks requiring 5-12 clicks when navigation was clear
+
+**Guideline**: Minimize clicks, but prioritize clarity over absolute number
+- **Good**: 5 clear, purposeful clicks
+- **Bad**: 2 clicks but confusing labels, users backtrack
+
+### Breadth-First vs. Depth-First Navigation
+
+**Breadth-first** (shallow, many top-level options):
+- **Structure**: 10-15 top-level categories, 2-3 levels deep
+- **Best for**: Browsing, exploration, users know general area but not exact item
+- **Example**: News sites, e-commerce homepages
+
+**Depth-first** (narrow, few top-level but deep):
+- **Structure**: 3-5 top-level categories, 4-6 levels deep
+- **Best for**: Specific lookup, expert users, hierarchical domains
+- **Example**: Technical documentation, academic libraries
+
+**Hybrid** (recommended for most):
+- **Structure**: 5-7 top-level categories, 3-4 levels deep
+- **Supplement with**: Search, filters, related links to "shortcut" across hierarchy
+
+### Progressive Disclosure
+
+**Principle**: Start simple, reveal complexity on-demand
+
+**Techniques**:
+
+1. **Hub-and-spoke**: Overview page → Detailed pages
+   - Hub: "Getting Started" with 5 clear entry points
+   - Spokes: Detailed guides linked from hub
+
+2. **Accordion/Collapse**: Hide detail until user expands
+   - Navigation: Show categories, hide subcategories until expanded
+   - Content: Show summary, expand for full text
+
+3. **Tiered navigation**: Primary nav (always visible) + secondary nav (contextual)
+   - Primary: "Products", "Support", "About"
+   - Secondary (when in "Products"): "Electronics", "Clothing", "Books"
+
+4. **"More..." links**: Show top N items, hide rest until "Show more" clicked
+   - Navigation: Top 5 categories visible, "+3 more" link expands
+
+**Anti-pattern**: Mega-menus showing everything at once (overwhelming)
+
+---
+
+## 4. Information Scent & Findability
+
+### Information Scent
+
+**Definition**: Cues that indicate whether a path will lead to desired information
+
+- **Strong scent**: Clear labels, descriptive headings, users click confidently
+- **Weak scent**: Vague labels, users guess, backtrack often
+
+**Example**:
+- **Weak scent**: "Solutions" → What's in there? (generic)
+- **Strong scent**: "Developer API Documentation" → Clear what's inside
+
+**Optimizing information scent**:
+
+1. **Specific labels** (not generic):
+   - Bad: "Resources" → Too vague
+   - Good: "Code Samples", "Video Tutorials", "White Papers" → Specific
+
+2. **Trigger words** (match user vocabulary):
+   - Card sort reveals users say "How do I..." → Label category "How-To Guides"
+   - Users search "pricing" → Ensure "Pricing" in nav, not "Plans" or "Subscription"
+
+3. **Descriptive breadcrumbs**:
+   - Bad: "Home > Section 1 > Page 3" → No meaning
+   - Good: "Home > Developer Docs > API Reference" → Clear path
+
+4. **Preview text**: Show snippet of content under link
+   - Navigation item: "API Reference" + "Complete list of endpoints and parameters"
+
+### Findability Metrics
+
+**Key metrics to track**:
+
+1. **Time to find**: How long to locate content?
+   - **Target**: <30 sec for simple tasks, <2 min for complex
+   - **Measurement**: Task completion time in usability tests
+
+2. **Success rate**: % of users who find content?
+   - **Target**: ≥70% (tree test), ≥80% (live site with search)
+   - **Measurement**: Tree test results, task success in usability tests
+
+3. **Search vs. browse**: Do users search or navigate?
+   - **Good**: 40-60% browse, 40-60% search (both work)
+   - **Bad**: 90% search (navigation broken), 90% browse (search broken)
+   - **Measurement**: Analytics (search usage %, nav click-through)
+
+4. **Search refinement rate**: % of searches that are refined?
+   - **Target**: <30% (users find on first search)
+   - **Bad**: >50% (users search, refine, search again → poor results)
+   - **Measurement**: Analytics (queries per session)
+
+5. **Bounce rate by entry point**: % leaving immediately?
+   - **Target**: <40% for landing pages
+   - **Bad**: >60% (users don't find what they expected)
+   - **Measurement**: Analytics (bounce rate by page)
+
+6. **Navigation abandonment**: % who start navigating, then leave?
+   - **Target**: <20%
+   - **Bad**: >40% (users get lost, give up)
+   - **Measurement**: Analytics (drop-off in navigation funnels)
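+
+As an example of turning one of these metrics into a number, the sketch below estimates the search refinement rate from per-session query logs. It treats any session with more than one query as refined, which is a simplifying assumption; the log format and the `refinement_rate` helper are hypothetical.
+
```python
def refinement_rate(sessions):
    """Share of search sessions that contain more than one query.

    `sessions` is a list of query lists, one per session. Sessions without
    any search are ignored. Counting every multi-query session as a
    refinement is a simplification.
    """
    searched = [queries for queries in sessions if queries]
    if not searched:
        return 0.0
    refined = sum(1 for queries in searched if len(queries) > 1)
    return refined / len(searched)

# Hypothetical session logs.
sessions = [
    ["pricing"],
    ["api key", "rotate api key"],
    [],
    ["export csv"],
    ["sso", "saml sso", "saml setup"],
]
rate = refinement_rate(sessions)
print(f"Refinement rate: {rate:.0%} (target <30%, bad >50%)")
```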
+
+### Search vs. Navigation Trade-offs
+
+**When search is preferred**:
+- Large content sets (>5000 items)
+- Users know exactly what they want ("lookup" mode)
+- Diverse content types (hard to categorize consistently)
+
+**When navigation is preferred**:
+- Smaller content sets (<500 items)
+- Users browsing, exploring ("discovery" mode)
+- Hierarchical domains (clear parent-child relationships)
+
+**Best practice**: Offer BOTH
+- Navigation for discovery, context, exploration
+- Search for lookup, speed, known-item finding
+
+**Optimizing search**:
+- **Autocomplete**: Suggest as user types
+- **Filters**: Narrow results by category, date, type
+- **Best bets**: Featured results for common queries
+- **Zero-results page**: Suggest alternatives, show popular content
+
+**Optimizing navigation**:
+- **Clear labels**: Match user vocabulary (card sort insights)
+- **Faceted filters**: Browse + filter combination
+- **Related links**: Help users discover adjacent content
+- **Breadcrumbs**: Show path, enable backtracking
+
+---
+
+## 5. Advanced Topics
+
+### Mental Models & User Research
+
+**Mental model**: User's internal representation of how system works
+
+**Why it matters**: Navigation should match user's mental model, not company's org chart
+
+**Researching mental models**:
+
+1. **Card sorting**: Reveals how users group/label content
+2. **User interviews**: Ask "How would you organize this?" "What would you call this?"
+3. **Tree testing**: Validates if proposed structure matches mental model
+4. **First-click testing**: Where do users expect to find X?
+
+**Common mismatches**:
+- **Company thinks**: "Features" (technical view)
+- **Users think**: "What can I do?" (task view)
+- **Solution**: Rename to task-based labels ("Create Report", "Share Dashboard")
+
+**Example**: SaaS product
+- **Internal (wrong)**: "Modules" → "Synergistic Solutions" → "Widget Management"
+- **User mental model (right)**: "Features" → "Reporting" → "Custom Reports"
+
+### Cross-Cultural IA
+
+**Challenge**: Different cultures have different categorization preferences
+
+**Examples**:
+- **Alphabetical**: Works for alphabetic scripts such as Latin, not for logographic writing (Chinese characters, Japanese kanji)
+- **Color coding**: Red = danger (Western), Red = luck (Chinese)
+- **Icons**: Mailbox icon = email (US), doesn't translate (many countries have different mailbox designs)
+
+**Strategies**:
+1. **Localization testing**: Card sort with target culture users
+2. **Avoid culturally-specific metaphors**: "Home run", "touchdown" (US sports)
+3. **Simple, universal labels**: "Home", "Search", "Help" (widely understood)
+4. **Icons + text**: Don't rely on icons alone
+
+### IA Governance
+
+**Problem**: Taxonomy degrades over time without maintenance
+
+**Governance framework**:
+
+1. **Roles**:
+   - **Content owner**: Publishes content, assigns categories/tags
+   - **Taxonomy owner**: Maintains category structure, adds/removes categories
+   - **IA steward**: Monitors usage, recommends improvements
+
+2. **Processes**:
+   - **Quarterly review**: Check taxonomy usage, identify issues
+   - **Change request**: How to propose new categories or restructure
+   - **Deprecation**: Process for removing outdated categories
+   - **Tag moderation**: Review user-generated tags, merge synonyms
+
+3. **Metrics to monitor**:
+   - % content in "Other" or "Uncategorized" (should be <5%)
+   - Empty categories (no content) — remove or consolidate
+   - Oversized categories (>50% of content) — split into subcategories
+
+4. **Tools**:
+   - CMS with taxonomy management
+   - Analytics to track usage
+   - Automated alerts (e.g., "Category X has no content")
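+
+The monitoring thresholds in this framework lend themselves to automated alerts. The sketch below assumes a simple mapping of category name to item count; `taxonomy_alerts` and the sample counts are hypothetical and not tied to any particular CMS.
+
```python
def taxonomy_alerts(counts, catch_all=("Other", "Uncategorized")):
    """Return alert messages based on the governance thresholds.

    Flags catch-all content at 5% or more, empty categories, and any
    category holding more than 50% of all content.
    """
    total = sum(counts.values())
    if total == 0:
        return ["Taxonomy has no content"]
    alerts = []
    catch_share = sum(counts.get(name, 0) for name in catch_all) / total
    if catch_share >= 0.05:
        alerts.append(f"{catch_share:.0%} of content is in catch-all categories")
    for name, n in counts.items():
        if n == 0:
            alerts.append(f"Category {name!r} has no content")
        elif n / total > 0.50:
            alerts.append(f"Category {name!r} holds {n / total:.0%} of content")
    return alerts

# Hypothetical category counts.
counts = {"Guides": 120, "API": 60, "Legacy": 0, "Other": 20}
alerts = taxonomy_alerts(counts)
for message in alerts:
    print(message)
```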
+
+### Personalization & Dynamic IA
+
+**Concept**: Navigation adapts to user
+
+**Approaches**:
+
+1. **Audience-based**: Show different nav for different user types
+   - "For Developers", "For Marketers", "For Executives"
+
+2. **History-based**: Prioritize recently visited or frequently used
+   - "Recently Viewed", "Your Favorites"
+
+3. **Context-based**: Show nav relevant to current task
+   - "Related Articles", "Next Steps"
+
+4. **Adaptive search**: Results ranked by user's past behavior
+
+**Caution**: Don't over-personalize
+- Users need consistency to build mental model
+- Personalization should augment, not replace, standard navigation
+
+### IA for Voice & AI Interfaces
+
+**Challenge**: Traditional visual hierarchy doesn't work for voice
+
+**Strategies**:
+
+1. **Flat structure**: No deep nesting (can't show menu)
+2. **Natural language categories**: "Where can I find information about X?" vs. "Navigate to Category > Subcategory"
+3. **Conversational**: "What would you like to do?" vs. "Select option 1, 2, or 3"
+4. **Context-aware**: Remember user's previous question, continue conversation
+
+**Example**:
+- **Web**: Home > Products > Electronics > Phones
+- **Voice**: "Show me phones" → "Here are our top phone options..."
+
+---
+
+## Summary
+
+**Card sorting** reveals user mental models through similarity matrices, dendrograms, and consensus scores. Outliers indicate unclear content.
+
+**Taxonomy design** follows MECE principle (mutually exclusive, collectively exhaustive). Use faceted classification for scale, controlled vocabulary for consistency, and plan for evolution.
+
+**Navigation optimization** balances breadth (many choices) vs. depth (many clicks). Optimal: 5-9 items per level, 3-4 levels deep. Progressive disclosure reduces initial complexity.
+
+**Information scent** guides users with clear labels, trigger words, and descriptive breadcrumbs. Track findability metrics: time to find (<30 sec), success rate (≥70%), search vs. browse balance (40-60% each).
+
+**Advanced techniques** include mental model research (card sort, interviews), cross-cultural adaptation, governance frameworks, personalization, and voice interface design.
+
+**The goal**: Users can predict where information lives and find it quickly, regardless of access method.