Information Architecture: Advanced Methodology

This document covers advanced techniques for card sorting analysis, taxonomy design, navigation optimization, and findability improvement.

Table of Contents

  1. Card Sorting Analysis
  2. Taxonomy Design Principles
  3. Navigation Depth & Breadth Optimization
  4. Information Scent & Findability
  5. Advanced Topics

1. Card Sorting Analysis

Analyzing Card Sort Results

Goal: Extract meaningful patterns from user groupings

Similarity Matrix

What it is: Shows how often users grouped two cards together

How to calculate:

  • For each pair of cards, count how many users put them in the same group
  • Express as percentage: (# users who grouped together) / (total users)

Example:

              Sign Up   First Login   Quick Start   Reports   Dashboards
Sign Up          -          85%           90%         15%        10%
First Login     85%          -            88%         12%         8%
Quick Start     90%         88%            -          10%        12%
Reports         15%         12%           10%          -         75%
Dashboards      10%          8%           12%         75%         -

Interpretation:

  • Strong clustering (>70%): "Sign Up", "First Login", "Quick Start" belong together → "Getting Started" category
  • Strong clustering (75%): "Reports" and "Dashboards" belong together → "Analytics" category
  • Weak links (<20%): "Getting Started" and "Analytics" are distinct categories
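
The percentages above can be computed directly from raw sort data. A minimal Python sketch follows, assuming each participant's sort is stored as a dictionary mapping card name to the group label they chose (the sample data and function name are illustrative):

from itertools import combinations

# One dictionary per participant: card name -> group label (illustrative data)
sorts = [
    {"Sign Up": "Onboarding", "First Login": "Onboarding", "Reports": "Analytics"},
    {"Sign Up": "Start here", "First Login": "Start here", "Reports": "Data"},
]

def similarity_matrix(sorts):
    """Return {(card_a, card_b): share of participants who put both cards in the same group}."""
    cards = sorted({card for s in sorts for card in s})
    matrix = {}
    for a, b in combinations(cards, 2):
        # (# users who grouped the pair together) / (total users)
        together = sum(1 for s in sorts if s.get(a) is not None and s.get(a) == s.get(b))
        matrix[(a, b)] = together / len(sorts)
    return matrix

for pair, score in sorted(similarity_matrix(sorts).items(), key=lambda kv: -kv[1]):
    print(f"{pair[0]} + {pair[1]}: {score:.0%}")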

Dendrogram (Hierarchical Clustering)

What it is: Tree diagram showing hierarchical relationships

How to create:

  1. Start with each card as its own cluster
  2. Iteratively merge closest clusters (highest similarity)
  3. Continue until all cards in one cluster

Interpreting dendrograms:

  • Short branches: High agreement (merge early)
  • Long branches: Low agreement (merge late)
  • Clusters: Cut tree at appropriate height to identify categories

Example:

                        All Cards
                            |
        ____________________+_____________________
        |                                         |
    Getting Started                          Features
        |                                         |
    ____+____                              _____+_____
    |        |                            |           |
  Sign Up  First Login              Analytics    Settings
                                        |
                                    ____+____
                                    |        |
                                 Reports  Dashboards

Insight: Users see clear distinction between "Getting Started" (onboarding tasks) and "Features" (ongoing use).
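
Given the similarity matrix as a NumPy array, SciPy's hierarchical clustering can generate the dendrogram rather than drawing it by hand. A minimal sketch, assuming similarities are expressed as fractions between 0 and 1 (the card labels match the example above):

import numpy as np
from scipy.cluster.hierarchy import linkage, dendrogram
from scipy.spatial.distance import squareform
import matplotlib.pyplot as plt

cards = ["Sign Up", "First Login", "Quick Start", "Reports", "Dashboards"]
similarity = np.array([
    [1.00, 0.85, 0.90, 0.15, 0.10],
    [0.85, 1.00, 0.88, 0.12, 0.08],
    [0.90, 0.88, 1.00, 0.10, 0.12],
    [0.15, 0.12, 0.10, 1.00, 0.75],
    [0.10, 0.08, 0.12, 0.75, 1.00],
])

# Convert similarity to distance, condense the square matrix, then cluster.
distance = 1.0 - similarity
condensed = squareform(distance, checks=False)
tree = linkage(condensed, method="average")

dendrogram(tree, labels=cards)
plt.ylabel("Distance (1 - similarity)")
plt.show()

Average linkage is one reasonable choice here; for clusters as distinct as these, complete or Ward linkage would produce a similar tree.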

Agreement Score (Consensus)

What it is: How much users agree on groupings

Calculation methods:

  1. Category agreement: % of users who created a similar category

    • Example: 18/20 users (90%) created a "Getting Started" category
  2. Pairwise agreement: Average similarity across all card pairs

    • Formula: Sum(all pairwise similarities) / Number of pairs
    • High score (>70%) = strong consensus
    • Low score (<50%) = weak consensus; the cards or categories need refinement
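
Using the pairwise similarity matrix from Section 1, the pairwise agreement score is simply the mean of all pairwise values. A minimal sketch (the function name is illustrative):

def pairwise_agreement(matrix):
    """Average similarity across all card pairs; matrix is {(card_a, card_b): similarity}."""
    return sum(matrix.values()) / len(matrix) if matrix else 0.0

# e.g. 0.72 indicates strong consensus; anything below 0.50 suggests refinement is needed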

When consensus is low:

  • Cards may be ambiguous (clarify labels)
  • Users have different mental models (consider multiple navigation paths)
  • Category is too broad (split into subcategories)

Outlier Cards

What they are: Cards that don't fit anywhere consistently

How to identify: Low similarity with all other cards (<30% with any card)

Common reasons:

  • Card label is unclear → Rewrite card
  • Content doesn't belong in product → Remove
  • Content is unique → Create standalone category or utility link

Example: "Billing" card — 15 users put it in "Settings", 3 in "Account", 2 didn't categorize it

  • Action: Clarify if "Billing" is settings (configuration) or account (transactions)
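
Outlier detection can also be scripted against the pairwise similarity matrix from Section 1: flag any card whose best similarity with any other card stays under the threshold. A minimal sketch, using the 30% guideline above:

def outlier_cards(matrix, threshold=0.30):
    """Return cards whose maximum similarity with any other card is below the threshold."""
    best = {}
    for (a, b), score in matrix.items():
        best[a] = max(best.get(a, 0.0), score)
        best[b] = max(best.get(b, 0.0), score)
    return [card for card, top_score in best.items() if top_score < threshold]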

2. Taxonomy Design Principles

Mutually Exclusive, Collectively Exhaustive (MECE)

Principle: Categories don't overlap AND cover all content

Mutually exclusive: Each item belongs to exactly ONE category

  • Bad: "Products" and "Best Sellers" (best sellers are also products — overlap)
  • Good: "Products" (all) and "Featured" (separate facet or tag)

Collectively exhaustive: Every item has a category

  • Bad: Categories: "Electronics", "Clothing" — but you also sell "Books" (gap)
  • Good: Add "Books" OR create "Other" catch-all

Testing MECE:

  1. List all content items
  2. Try to categorize each
  3. If item fits >1 category → not mutually exclusive
  4. If item fits 0 categories → not collectively exhaustive
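
When content assignments live in a spreadsheet or CMS export, this test is easy to automate. A minimal sketch, assuming each item maps to the set of categories it was placed in (the data and function name are illustrative):

def check_mece(item_categories):
    """item_categories: {item: set of categories}. Print and return MECE violations."""
    overlapping = {i: cats for i, cats in item_categories.items() if len(cats) > 1}
    uncategorized = [i for i, cats in item_categories.items() if not cats]
    if overlapping:
        print("Not mutually exclusive:", overlapping)
    if uncategorized:
        print("Not collectively exhaustive:", uncategorized)
    return not overlapping and not uncategorized

check_mece({
    "iPhone case": {"Electronics"},
    "Gift card": set(),
    "Best-selling widget": {"Products", "Best Sellers"},
})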

Polyhierarchy vs. Faceted Classification

Polyhierarchy: Item can live in multiple places in hierarchy

  • Example: "iPhone case" could be in:
    • Electronics > Accessories > Phone Accessories
    • Gifts > Under $50 > Tech Gifts
  • Pro: Matches multiple user mental models
  • Con: Confusing (where is "canonical" location?), hard to maintain

Faceted classification: Item has ONE location, multiple orthogonal attributes

  • Example: "iPhone case" is in Electronics (primary category)
    • Facet 1: Category = Electronics
    • Facet 2: Price = Under $50
    • Facet 3: Use Case = Gifts
  • Pro: Clear, flexible filtering, scalable
  • Con: Requires good facet design
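
In practice, faceted classification usually means one primary category field plus independent attribute fields on each item. A minimal sketch of that data shape (field and facet names are illustrative):

from dataclasses import dataclass, field

@dataclass
class Item:
    name: str
    category: str                                 # single canonical location
    facets: dict = field(default_factory=dict)    # orthogonal attributes

iphone_case = Item(
    name="iPhone case",
    category="Electronics",
    facets={"price_band": "Under $50", "use_case": "Gifts"},
)

# Filtering by a facet never moves the item; cross-cutting views are built on demand.
catalog = [iphone_case]
gift_ideas = [i for i in catalog if i.facets.get("use_case") == "Gifts"]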

When to use each:

  • Polyhierarchy: Small content sets (<500 items), clear user need for multiple paths
  • Faceted: Large content sets (>500 items), many attributes, users need flexible filtering

Controlled Vocabulary vs. Folksonomy

Controlled vocabulary: Preset tags, curated by admins

  • Example: "Authentication", "API", "Database" (exact tags, no variations)
  • Pro: Consistency, findability, no duplication ("Auth" vs "Authentication")
  • Con: Requires maintenance, may miss user terminology

Folksonomy: User-generated tags, anyone can create

  • Example: Users tag articles with whatever terms they want
  • Pro: Emergent, captures user language, low maintenance
  • Con: Inconsistent, duplicates, noise ("Auth", "Authentication", "auth", "Authn")

Hybrid approach (recommended):

  • Controlled vocabulary for core categories and facets
  • Folksonomy for supplementary tags (with moderation)
  • Periodically review folksonomy tags → promote common ones to controlled vocabulary

Tag moderation:

  • Merge synonyms: "Auth" → "Authentication"
  • Remove noise: "asdf", "test"
  • Suggest tags: When user types "auth", suggest "Authentication"
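
Synonym merging and tag suggestion can start as a simple lookup table maintained by the taxonomy owner. A minimal sketch (the synonym map and function names are illustrative):

CANONICAL = {"auth": "Authentication", "authn": "Authentication", "db": "Database"}

def normalize_tag(raw_tag):
    """Map a user-entered tag to its controlled-vocabulary form if one exists."""
    cleaned = raw_tag.strip().lower()
    return CANONICAL.get(cleaned, raw_tag.strip())

def suggest_tags(prefix, vocabulary):
    """Suggest controlled-vocabulary tags matching what the user has typed so far."""
    prefix = prefix.strip().lower()
    return [t for t in vocabulary if t.lower().startswith(prefix)]

print(normalize_tag("Auth"))                                        # Authentication
print(suggest_tags("auth", ["Authentication", "API", "Database"]))  # ['Authentication']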

Category Size & Balance

Guideline: Aim for balanced category sizes (no one category dominates)

Red flags:

  • One huge category: "Other" with 60% of items → need better taxonomy
  • Many tiny categories: 20 categories, each with 2-5 items → over-categorization, consolidate
  • Unbalanced tree: One branch 5 levels deep, others 2 levels → inconsistent complexity

Target distribution:

  • Top-level categories: 5-9 categories
  • Each category: Roughly equal # of items (within 2× of each other)
  • If one category much larger: Split into subcategories

Example: E-commerce with 1000 products

  • Bad: Electronics (600), Clothing (300), Books (80), Other (20)
  • Good: Electronics (250), Clothing (250), Books (250), Home & Garden (250)
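
These checks can run automatically against a content inventory. A small sketch, assuming a mapping of category name to item count, using the 2× spread rule from this section and the 5% "Other" ceiling from the governance guidance in Section 5:

def balance_report(category_counts, other_label="Other"):
    """Flag imbalance: a dominant category, a wide size spread, or an oversized catch-all."""
    total = sum(category_counts.values())
    core = {k: v for k, v in category_counts.items() if k != other_label}
    largest, smallest = max(core.values()), min(core.values())
    if largest > 2 * smallest:
        print(f"Largest category ({largest}) is more than 2x the smallest ({smallest})")
    other_share = category_counts.get(other_label, 0) / total
    if other_share > 0.05:
        print(f"'{other_label}' holds {other_share:.0%} of items - taxonomy gap")

balance_report({"Electronics": 600, "Clothing": 300, "Books": 80, "Other": 20})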

Taxonomy Evolution

Principle: Taxonomies grow and change — design for evolution

Strategies:

  1. Leave room for growth: Don't create 10 top-level categories if you'll need 15 next year
  2. Use "Other" temporarily: New category emerging but not big enough yet? Use "Other" until critical mass
  3. Versioning: Date taxonomy versions, track changes over time
  4. Deprecation: Don't delete categories immediately — mark "deprecated", redirect users, then remove after transition period

Example: Software product adding ML features

  • Today: 20 ML-related articles scattered across "Advanced", "API", "Tutorials"
  • Transition: Create "Machine Learning" subcategory under "Advanced"
  • Future: 100 ML articles → Promote "Machine Learning" to top-level category

3. Navigation Depth & Breadth Optimization

Hick's Law & Choice Overload

Hick's Law: Decision time increases logarithmically with number of choices

Formula: Time = a + b × log₂(n + 1)

  • More choices → longer time to decide
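
A quick way to build intuition for the formula is to plug in a few menu sizes. A minimal sketch; the constants a and b are placeholders that would normally be fit from user testing:

import math

def hicks_law(n_choices, a=0.5, b=0.2):
    """Estimated decision time in seconds for one menu level (a, b are illustrative constants)."""
    return a + b * math.log2(n_choices + 1)

for n in [3, 7, 12, 50]:
    print(n, "choices ->", round(hicks_law(n), 2), "s")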

Implications for IA:

  • 5-9 items per level: Sweet spot (Miller's "7±2")
  • >12 items: Users feel overwhelmed, scan inefficiently
  • <3 items: Feels unnecessarily nested

Example:

  • 100 items, flat (1 level, 100 choices): Overwhelming
  • 100 items, 2 levels (10 × 10): Manageable
  • 100 items, 4 levels (3 × 3 × 3 × 4): Too many clicks

Optimal for 100 items: 3 levels (5 × 5 × 4) or (7 × 7 × 2)

The "3-Click Rule" Myth

Myth: Users abandon if content requires >3 clicks

Reality: Users tolerate clicks if:

  1. Progress is clear: Breadcrumbs, page titles show "getting closer"
  2. Information scent is strong: Each click brings them closer to goal (see Section 4)
  3. No dead ends: Every click leads somewhere useful

Research (UIE study): Users successfully completed tasks requiring 5-12 clicks when navigation was clear

Guideline: Minimize clicks, but prioritize clarity over the absolute number of clicks

  • Good: 5 clear, purposeful clicks
  • Bad: 2 clicks but confusing labels, users backtrack

Breadth-First vs. Depth-First Navigation

Breadth-first (shallow, many top-level options):

  • Structure: 10-15 top-level categories, 2-3 levels deep
  • Best for: Browsing, exploration, users know general area but not exact item
  • Example: News sites, e-commerce homepages

Depth-first (narrow, few top-level but deep):

  • Structure: 3-5 top-level categories, 4-6 levels deep
  • Best for: Specific lookup, expert users, hierarchical domains
  • Example: Technical documentation, academic libraries

Hybrid (recommended for most):

  • Structure: 5-7 top-level categories, 3-4 levels deep
  • Supplement with: Search, filters, related links to "shortcut" across hierarchy

Progressive Disclosure

Principle: Start simple, reveal complexity on-demand

Techniques:

  1. Hub-and-spoke: Overview page → Detailed pages

    • Hub: "Getting Started" with 5 clear entry points
    • Spokes: Detailed guides linked from hub
  2. Accordion/Collapse: Hide detail until user expands

    • Navigation: Show categories, hide subcategories until expanded
    • Content: Show summary, expand for full text
  3. Tiered navigation: Primary nav (always visible) + secondary nav (contextual)

    • Primary: "Products", "Support", "About"
    • Secondary (when in "Products"): "Electronics", "Clothing", "Books"
  4. "More..." links: Show top N items, hide rest until "Show more" clicked

    • Navigation: Top 5 categories visible, "+3 more" link expands

Anti-pattern: Mega-menus showing everything at once (overwhelming)


4. Information Scent & Findability

Information Scent

Definition: Cues that indicate whether a path will lead to desired information

Strong scent: Clear labels, descriptive headings, users click confidently
Weak scent: Vague labels, users guess, backtrack often

Example:

  • Weak scent: "Solutions" → What's in there? (generic)
  • Strong scent: "Developer API Documentation" → Clear what's inside

Optimizing information scent:

  1. Specific labels (not generic):

    • Bad: "Resources" → Too vague
    • Good: "Code Samples", "Video Tutorials", "White Papers" → Specific
  2. Trigger words (match user vocabulary):

    • Card sort reveals users say "How do I..." → Label category "How-To Guides"
    • Users search "pricing" → Ensure "Pricing" in nav, not "Plans" or "Subscription"
  3. Descriptive breadcrumbs:

    • Bad: "Home > Section 1 > Page 3" → No meaning
    • Good: "Home > Developer Docs > API Reference" → Clear path
  4. Preview text: Show snippet of content under link

    • Navigation item: "API Reference" + "Complete list of endpoints and parameters"

Findability Metrics

Key metrics to track:

  1. Time to find: How long to locate content?

    • Target: <30 sec for simple tasks, <2 min for complex
    • Measurement: Task completion time in usability tests
  2. Success rate: What % of users find the content?

    • Target: ≥70% (tree test), ≥80% (live site with search)
    • Measurement: Tree test results, task success in usability tests
  3. Search vs. browse: Do users search or navigate?

    • Good: 40-60% browse, 40-60% search (both work)
    • Bad: 90% search (navigation broken), 90% browse (search broken)
    • Measurement: Analytics (search usage %, nav click-through)
  4. Search refinement rate: What % of searches are refined?

    • Target: <30% (users find on first search)
    • Bad: >50% (users search, refine, search again → poor results)
    • Measurement: Analytics (queries per session)
  5. Bounce rate by entry point: What % of visitors leave immediately?

    • Target: <40% for landing pages
    • Bad: >60% (users don't find what they expected)
    • Measurement: Analytics (bounce rate by page)
  6. Navigation abandonment: What % of users start navigating, then leave?

    • Target: <20%
    • Bad: >40% (users get lost, give up)
    • Measurement: Analytics (drop-off in navigation funnels)
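
Several of these metrics fall out of ordinary session logs. A minimal sketch computing the search refinement rate and the search vs. browse split, assuming each session records its search queries and navigation clicks (the session structure is illustrative):

# Illustrative session records: queries issued and navigation clicks per session
sessions = [
    {"queries": ["pricing"], "nav_clicks": 0},
    {"queries": ["api auth", "api authentication"], "nav_clicks": 2},
    {"queries": [], "nav_clicks": 5},
]

search_sessions = [s for s in sessions if s["queries"]]
refined = [s for s in search_sessions if len(s["queries"]) > 1]

refinement_rate = len(refined) / len(search_sessions) if search_sessions else 0.0
search_share = len(search_sessions) / len(sessions)

print(f"Search refinement rate: {refinement_rate:.0%}  (target < 30%)")
print(f"Sessions using search:  {search_share:.0%}  (healthy: roughly 40-60%)")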

Search vs. Navigation Trade-offs

When search is preferred:

  • Large content sets (>5000 items)
  • Users know exactly what they want ("lookup" mode)
  • Diverse content types (hard to categorize consistently)

When navigation is preferred:

  • Smaller content sets (<500 items)
  • Users browsing, exploring ("discovery" mode)
  • Hierarchical domains (clear parent-child relationships)

Best practice: Offer BOTH

  • Navigation for discovery, context, exploration
  • Search for lookup, speed, known-item finding

Optimizing search:

  • Autocomplete: Suggest as user types
  • Filters: Narrow results by category, date, type
  • Best bets: Featured results for common queries
  • Zero-results page: Suggest alternatives, show popular content

Optimizing navigation:

  • Clear labels: Match user vocabulary (card sort insights)
  • Faceted filters: Browse + filter combination
  • Related links: Help users discover adjacent content
  • Breadcrumbs: Show path, enable backtracking

5. Advanced Topics

Mental Models & User Research

Mental model: User's internal representation of how system works

Why it matters: Navigation should match user's mental model, not company's org chart

Researching mental models:

  1. Card sorting: Reveals how users group/label content
  2. User interviews: Ask "How would you organize this?" "What would you call this?"
  3. Tree testing: Validates if proposed structure matches mental model
  4. First-click testing: Where do users expect to find X?

Common mismatches:

  • Company thinks: "Features" (technical view)
  • Users think: "What can I do?" (task view)
  • Solution: Rename to task-based labels ("Create Report", "Share Dashboard")

Example: SaaS product

  • Internal (wrong): "Modules" → "Synergistic Solutions" → "Widget Management"
  • User mental model (right): "Features" → "Reporting" → "Custom Reports"

Cross-Cultural IA

Challenge: Different cultures have different categorization preferences

Examples:

  • Alphabetical: Works for Latin scripts, not ideographic (Chinese, Japanese)
  • Color coding: Red = danger (Western), Red = luck (Chinese)
  • Icons: Mailbox icon = email (US), doesn't translate (many countries have different mailbox designs)

Strategies:

  1. Localization testing: Card sort with target culture users
  2. Avoid culturally-specific metaphors: "Home run", "touchdown" (US sports)
  3. Simple, universal labels: "Home", "Search", "Help" (widely understood)
  4. Icons + text: Don't rely on icons alone

IA Governance

Problem: Taxonomy degrades over time without maintenance

Governance framework:

  1. Roles:

    • Content owner: Publishes content, assigns categories/tags
    • Taxonomy owner: Maintains category structure, adds/removes categories
    • IA steward: Monitors usage, recommends improvements
  2. Processes:

    • Quarterly review: Check taxonomy usage, identify issues
    • Change request: How to propose new categories or restructure
    • Deprecation: Process for removing outdated categories
    • Tag moderation: Review user-generated tags, merge synonyms
  3. Metrics to monitor:

    • % content in "Other" or "Uncategorized" (should be <5%)
    • Empty categories (no content) — remove or consolidate
    • Oversized categories (>50% of content) — split into subcategories
  4. Tools:

    • CMS with taxonomy management
    • Analytics to track usage
    • Automated alerts (e.g., "Category X has no content")

Personalization & Dynamic IA

Concept: Navigation adapts to user

Approaches:

  1. Audience-based: Show different nav for different user types

    • "For Developers", "For Marketers", "For Executives"
  2. History-based: Prioritize recently visited or frequently used

    • "Recently Viewed", "Your Favorites"
  3. Context-based: Show nav relevant to current task

    • "Related Articles", "Next Steps"
  4. Adaptive search: Results ranked by user's past behavior

Caution: Don't over-personalize

  • Users need consistency to build mental model
  • Personalization should augment, not replace, standard navigation

IA for Voice & AI Interfaces

Challenge: Traditional visual hierarchy doesn't work for voice

Strategies:

  1. Flat structure: No deep nesting (a voice interface can't display a menu to scan)
  2. Natural language categories: "Where can I find information about X?" vs. "Navigate to Category > Subcategory"
  3. Conversational: "What would you like to do?" vs. "Select option 1, 2, or 3"
  4. Context-aware: Remember user's previous question, continue conversation

Example:

  • Web: Home > Products > Electronics > Phones
  • Voice: "Show me phones" → "Here are our top phone options..."

Summary

Card sorting reveals user mental models through similarity matrices, dendrograms, and consensus scores. Outliers indicate unclear content.

Taxonomy design follows MECE principle (mutually exclusive, collectively exhaustive). Use faceted classification for scale, controlled vocabulary for consistency, and plan for evolution.

Navigation optimization balances breadth (many choices) vs. depth (many clicks). Optimal: 5-9 items per level, 3-4 levels deep. Progressive disclosure reduces initial complexity.

Information scent guides users with clear labels, trigger words, and descriptive breadcrumbs. Track findability metrics: time to find (<30 sec), success rate (≥70%), search vs. browse balance (40-60% each).

Advanced techniques include mental model research (card sort, interviews), cross-cultural adaptation, governance frameworks, personalization, and voice interface design.

The goal: Users can predict where information lives and find it quickly, regardless of access method.