Information Architecture: Advanced Methodology
This document covers advanced techniques for card sorting analysis, taxonomy design, navigation optimization, and findability improvement.
Table of Contents
- Card Sorting Analysis
- Taxonomy Design Principles
- Navigation Depth & Breadth Optimization
- Information Scent & Findability
- Advanced Topics
1. Card Sorting Analysis
Analyzing Card Sort Results
Goal: Extract meaningful patterns from user groupings
Similarity Matrix
What it is: Shows how often users grouped two cards together
How to calculate:
- For each pair of cards, count how many users put them in the same group
- Express as percentage: (# users who grouped together) / (total users)
Example:
| | Sign Up | First Login | Quick Start | Reports | Dashboards |
|---|---|---|---|---|---|
| Sign Up | - | 85% | 90% | 15% | 10% |
| First Login | 85% | - | 88% | 12% | 8% |
| Quick Start | 90% | 88% | - | 10% | 12% |
| Reports | 15% | 12% | 10% | - | 75% |
| Dashboards | 10% | 8% | 12% | 75% | - |
Interpretation:
- Strong clustering (>70%): "Sign Up", "First Login", "Quick Start" belong together → "Getting Started" category
- Strong clustering (75%): "Reports" and "Dashboards" belong together → "Analytics" category
- Weak links (<20%): "Getting Started" and "Analytics" are distinct categories
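A minimal sketch of this calculation in Python, assuming raw results are stored as one dict per participant mapping each card to the group label that participant used (the data below is made up for illustration):

```python
from itertools import combinations

# Hypothetical raw card-sort data: one dict per participant,
# mapping card -> group label (labels need not match across participants)
sorts = [
    {"Sign Up": "Start", "First Login": "Start", "Reports": "Data"},
    {"Sign Up": "Onboarding", "First Login": "Onboarding", "Reports": "Analytics"},
    {"Sign Up": "Setup", "First Login": "Analytics", "Reports": "Analytics"},
]

cards = sorted({card for sort in sorts for card in sort})
similarity = {}
for a, b in combinations(cards, 2):
    # Count participants who placed both cards in the same group
    together = sum(1 for s in sorts if a in s and b in s and s[a] == s[b])
    similarity[(a, b)] = together / len(sorts)

for (a, b), score in sorted(similarity.items(), key=lambda kv: -kv[1]):
    print(f"{a} / {b}: {score:.0%}")
```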
Dendrogram (Hierarchical Clustering)
What it is: Tree diagram showing hierarchical relationships
How to create:
- Start with each card as its own cluster
- Iteratively merge closest clusters (highest similarity)
- Continue until all cards in one cluster
Interpreting dendrograms:
- Short branches: High agreement (merge early)
- Long branches: Low agreement (merge late)
- Clusters: Cut tree at appropriate height to identify categories
Example:
    All Cards
    ├── Getting Started
    │   ├── Sign Up
    │   └── First Login
    └── Features
        ├── Analytics
        │   ├── Reports
        │   └── Dashboards
        └── Settings
Insight: Users see a clear distinction between "Getting Started" (onboarding tasks) and "Features" (ongoing use).
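One common way to produce such a dendrogram is hierarchical clustering over distances derived from the similarity matrix. A sketch using scipy and matplotlib (assumed installed) with the example matrix above; average linkage is one reasonable choice:

```python
import numpy as np
import matplotlib.pyplot as plt
from scipy.cluster.hierarchy import linkage, dendrogram
from scipy.spatial.distance import squareform

cards = ["Sign Up", "First Login", "Quick Start", "Reports", "Dashboards"]
# Pairwise similarities from the example matrix, as fractions
sim = np.array([
    [1.00, 0.85, 0.90, 0.15, 0.10],
    [0.85, 1.00, 0.88, 0.12, 0.08],
    [0.90, 0.88, 1.00, 0.10, 0.12],
    [0.15, 0.12, 0.10, 1.00, 0.75],
    [0.10, 0.08, 0.12, 0.75, 1.00],
])

# Clustering works on distances, so convert: the closest pairs merge first
distance = 1.0 - sim
Z = linkage(squareform(distance), method="average")

dendrogram(Z, labels=cards)
plt.ylabel("Distance (1 - similarity)")
plt.show()
```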
Agreement Score (Consensus)
What it is: How much users agree on groupings
Calculation methods:
- Category agreement: % of users who created similar category
  - Example: 18/20 users (90%) created "Getting Started" category
- Pairwise agreement: Average similarity across all card pairs
  - Formula: Sum(all pairwise similarities) / Number of pairs
  - High score (>70%) = strong consensus
  - Low score (<50%) = weak consensus, need refinement
When consensus is low:
- Cards may be ambiguous (clarify labels)
- Users have different mental models (consider multiple navigation paths)
- Category is too broad (split into subcategories)
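Continuing the similarity-matrix sketch above, the pairwise agreement score is just the mean of the scores it produced:

```python
def pairwise_agreement(similarity: dict[tuple[str, str], float]) -> float:
    """Mean similarity across all card pairs, per the formula above."""
    return sum(similarity.values()) / len(similarity)

# Reusing the `similarity` dict from the earlier sketch
score = pairwise_agreement(similarity)
print(f"Pairwise agreement: {score:.0%}")  # >70% strong, <50% weak consensus
```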
Outlier Cards
What they are: Cards that don't fit anywhere consistently
How to identify: Low similarity with all other cards (<30% with any card)
Common reasons:
- Card label is unclear → Rewrite card
- Content doesn't belong in product → Remove
- Content is unique → Create standalone category or utility link
Example: "Billing" card — 15 users put it in "Settings", 3 in "Account", 2 didn't categorize it
- Action: Clarify if "Billing" is settings (configuration) or account (transactions)
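The outlier check is equally mechanical. A sketch reusing `cards` and `similarity` from the similarity-matrix sketch, with the 30% rule of thumb as the default threshold:

```python
def find_outliers(cards, similarity, threshold=0.30):
    """Flag cards whose best similarity with any other card is below threshold."""
    flagged = []
    for card in cards:
        best = max((s for pair, s in similarity.items() if card in pair), default=0.0)
        if best < threshold:
            flagged.append((card, best))
    return flagged

# Prints nothing for the demo data, since every card clears the threshold
for card, best in find_outliers(cards, similarity):
    print(f"Outlier: {card!r} (best similarity {best:.0%})")
```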
2. Taxonomy Design Principles
Mutually Exclusive, Collectively Exhaustive (MECE)
Principle: Categories don't overlap AND cover all content
Mutually exclusive: Each item belongs to exactly ONE category
- Bad: "Products" and "Best Sellers" (best sellers are also products — overlap)
- Good: "Products" (all) and "Featured" (separate facet or tag)
Collectively exhaustive: Every item has a category
- Bad: Categories: "Electronics", "Clothing" — but you also sell "Books" (gap)
- Good: Add "Books" OR create "Other" catch-all
Testing MECE:
- List all content items
- Try to categorize each
- If item fits >1 category → not mutually exclusive
- If item fits 0 categories → not collectively exhaustive
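This audit can be scripted once each item has been matched against every category it plausibly fits. A minimal sketch with made-up audit data:

```python
# Hypothetical audit: each item mapped to every category it plausibly fits
candidate_fits = {
    "iPhone 15": ["Electronics"],                    # exactly one fit: OK
    "Bestseller novel": ["Books", "Best Sellers"],   # two fits: overlap
    "Garden hose": [],                               # no fit: gap
}

for item, cats in candidate_fits.items():
    if len(cats) > 1:
        print(f"Not mutually exclusive: {item!r} fits {cats}")
    elif not cats:
        print(f"Not collectively exhaustive: {item!r} fits no category")
```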
Polyhierarchy vs. Faceted Classification
Polyhierarchy: Item can live in multiple places in hierarchy
- Example: "iPhone case" could be in:
- Electronics > Accessories > Phone Accessories
- Gifts > Under $50 > Tech Gifts
- Pro: Matches multiple user mental models
- Con: Confusing (where is "canonical" location?), hard to maintain
Faceted classification: Item has ONE location, multiple orthogonal attributes
- Example: "iPhone case" is in Electronics (primary category)
- Facet 1: Category = Electronics
- Facet 2: Price = Under $50
- Facet 3: Use Case = Gifts
- Pro: Clear, flexible filtering, scalable
- Con: Requires good facet design
When to use each:
- Polyhierarchy: Small content sets (<500 items), clear user need for multiple paths
- Faceted: Large content sets (>500 items), many attributes, users need flexible filtering
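A sketch of one way to model a faceted item: one canonical category plus a dict of orthogonal attributes. Field names and facet keys are illustrative, not a prescribed schema:

```python
from dataclasses import dataclass, field

@dataclass
class Item:
    name: str
    category: str                                         # single canonical home
    facets: dict[str, str] = field(default_factory=dict)  # orthogonal attributes

catalog = [
    Item("iPhone case", "Electronics",
         {"price_band": "Under $50", "use_case": "Gifts"}),
]

# Filtering on facets never moves an item out of its canonical category
gift_ideas = [
    item for item in catalog
    if item.facets.get("price_band") == "Under $50"
    and item.facets.get("use_case") == "Gifts"
]
print([item.name for item in gift_ideas])  # ['iPhone case']
```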
Controlled Vocabulary vs. Folksonomy
Controlled vocabulary: Preset tags, curated by admins
- Example: "Authentication", "API", "Database" (exact tags, no variations)
- Pro: Consistency, findability, no duplication ("Auth" vs "Authentication")
- Con: Requires maintenance, may miss user terminology
Folksonomy: User-generated tags, anyone can create
- Example: Users tag articles with whatever terms they want
- Pro: Emergent, captures user language, low maintenance
- Con: Inconsistent, duplicates, noise ("Auth", "Authentication", "auth", "Authn")
Hybrid approach (recommended):
- Controlled vocabulary for core categories and facets
- Folksonomy for supplementary tags (with moderation)
- Periodically review folksonomy tags → promote common ones to controlled vocabulary
Tag moderation:
- Merge synonyms: "Auth" → "Authentication"
- Remove noise: "asdf", "test"
- Suggest tags: When user types "auth", suggest "Authentication"
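A minimal moderation sketch covering the merge and noise-removal steps; the synonym map and noise list here are illustrative:

```python
# Hypothetical synonym map maintained by the taxonomy owner
SYNONYMS = {"auth": "Authentication", "authn": "Authentication", "db": "Database"}
NOISE = {"asdf", "test"}

def moderate(tags: list[str]) -> list[str]:
    """Merge synonyms into controlled-vocabulary terms and drop noise tags."""
    cleaned = []
    for tag in tags:
        key = tag.strip().lower()
        if key in NOISE:
            continue
        cleaned.append(SYNONYMS.get(key, tag.strip()))
    return sorted(set(cleaned))

print(moderate(["Auth", "authn", "asdf", "API"]))  # ['API', 'Authentication']
```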
Category Size & Balance
Guideline: Aim for balanced category sizes (no one category dominates)
Red flags:
- One huge category: "Other" with 60% of items → need better taxonomy
- Many tiny categories: 20 categories, each with 2-5 items → over-categorization, consolidate
- Unbalanced tree: One branch 5 levels deep, others 2 levels → inconsistent complexity
Target distribution:
- Top-level categories: 5-9 categories
- Each category: Roughly equal # of items (within 2× of each other)
- If one category much larger: Split into subcategories
Example: E-commerce with 1000 products
- Bad: Electronics (600), Clothing (300), Books (80), Other (20)
- Good: Electronics (250), Clothing (250), Books (250), Home & Garden (250)
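A sketch of the balance check, encoding the rough 2x guideline and the oversized-category red flag:

```python
def check_balance(counts: dict[str, int], max_ratio: float = 2.0) -> None:
    """Warn when category sizes violate the balance guidelines above."""
    total = sum(counts.values())
    smallest, largest = min(counts.values()), max(counts.values())
    if largest > max_ratio * smallest:
        print(f"Unbalanced: largest ({largest}) exceeds {max_ratio}x smallest ({smallest})")
    for name, n in counts.items():
        if n / total > 0.5:
            print(f"{name!r} holds {n / total:.0%} of items; consider splitting")

check_balance({"Electronics": 600, "Clothing": 300, "Books": 80, "Other": 20})
```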
Taxonomy Evolution
Principle: Taxonomies grow and change — design for evolution
Strategies:
- Leave room for growth: Don't create 10 top-level categories if you'll need 15 next year
- Use "Other" temporarily: New category emerging but not big enough yet? Use "Other" until critical mass
- Versioning: Date taxonomy versions, track changes over time
- Deprecation: Don't delete categories immediately — mark "deprecated", redirect users, then remove after transition period
Example: Software product adding ML features
- Today: 20 ML-related articles scattered across "Advanced", "API", "Tutorials"
- Transition: Create "Machine Learning" subcategory under "Advanced"
- Future: 100 ML articles → Promote "Machine Learning" to top-level category
3. Navigation Depth & Breadth Optimization
Hick's Law & Choice Overload
Hick's Law: Decision time increases logarithmically with number of choices
Formula: Time = a + b × log₂(n + 1)
- More choices → longer time to decide
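A toy illustration of the formula; the constants a and b below are invented for the example (real values are fit from reaction-time data):

```python
import math

def hick_time(n: int, a: float = 0.5, b: float = 0.3) -> float:
    """Decision time in seconds for n equally likely choices: a + b*log2(n+1)."""
    return a + b * math.log2(n + 1)

for n in (3, 7, 12, 100):
    print(f"{n:>3} choices: {hick_time(n):.2f}s")

# Caveat: the log model assumes familiar, well-ordered options; scanning an
# unfamiliar menu grows closer to linearly, so large flat lists feel worse
# than the formula alone suggests.
```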
Implications for IA:
- 5-9 items per level: Sweet spot (Miller's "7±2")
- >12 items: Users feel overwhelmed, scan inefficiently
- <3 items: Feels unnecessarily nested
Example:
- 100 items, flat (1 level, 100 choices): Overwhelming
- 100 items, 2 levels (10 × 10): Manageable
- 100 items, 4 levels (3 × 3 × 3 × 4): Too many clicks
Optimal for 100 items: 3 levels (5 × 5 × 4) or (7 × 7 × 2)
The "3-Click Rule" Myth
Myth: Users abandon if content requires >3 clicks
Reality: Users tolerate clicks if:
- Progress is clear: Breadcrumbs, page titles show "getting closer"
- Information scent is strong: Each click brings them closer to goal (see Section 4)
- No dead ends: Every click leads somewhere useful
Research (UIE study): Users successfully completed tasks requiring 5-12 clicks when navigation was clear
Guideline: Minimize clicks, but prioritize clarity over absolute number
- Good: 5 clear, purposeful clicks
- Bad: 2 clicks but confusing labels, users backtrack
Breadth-First vs. Depth-First Navigation
Breadth-first (shallow, many top-level options):
- Structure: 10-15 top-level categories, 2-3 levels deep
- Best for: Browsing, exploration, users know general area but not exact item
- Example: News sites, e-commerce homepages
Depth-first (narrow, few top-level but deep):
- Structure: 3-5 top-level categories, 4-6 levels deep
- Best for: Specific lookup, expert users, hierarchical domains
- Example: Technical documentation, academic libraries
Hybrid (recommended for most):
- Structure: 5-7 top-level categories, 3-4 levels deep
- Supplement with: Search, filters, related links to "shortcut" across hierarchy
Progressive Disclosure
Principle: Start simple, reveal complexity on-demand
Techniques:
- Hub-and-spoke: Overview page → detailed pages
  - Hub: "Getting Started" with 5 clear entry points
  - Spokes: Detailed guides linked from the hub
- Accordion/collapse: Hide detail until user expands
  - Navigation: Show categories, hide subcategories until expanded
  - Content: Show summary, expand for full text
- Tiered navigation: Primary nav (always visible) + secondary nav (contextual)
  - Primary: "Products", "Support", "About"
  - Secondary (when in "Products"): "Electronics", "Clothing", "Books"
- "More..." links: Show top N items, hide the rest until "Show more" is clicked
  - Navigation: Top 5 categories visible, "+3 more" link expands
Anti-pattern: Mega-menus showing everything at once (overwhelming)
4. Information Scent & Findability
Information Scent
Definition: Cues that indicate whether a path will lead to desired information
Strong scent: Clear labels, descriptive headings, users click confidently
Weak scent: Vague labels, users guess, backtrack often
Example:
- Weak scent: "Solutions" → What's in there? (generic)
- Strong scent: "Developer API Documentation" → Clear what's inside
Optimizing information scent:
- Specific labels (not generic):
  - Bad: "Resources" → Too vague
  - Good: "Code Samples", "Video Tutorials", "White Papers" → Specific
- Trigger words (match user vocabulary):
  - Card sort reveals users say "How do I..." → Label category "How-To Guides"
  - Users search "pricing" → Ensure "Pricing" in nav, not "Plans" or "Subscription"
- Descriptive breadcrumbs:
  - Bad: "Home > Section 1 > Page 3" → No meaning
  - Good: "Home > Developer Docs > API Reference" → Clear path
- Preview text: Show a snippet of content under the link
  - Navigation item: "API Reference" + "Complete list of endpoints and parameters"
Findability Metrics
Key metrics to track (a computation sketch follows this list):
- Time to find: How long does it take to locate content?
  - Target: <30 sec for simple tasks, <2 min for complex
  - Measurement: Task completion time in usability tests
- Success rate: % of users who find the content
  - Target: ≥70% (tree test), ≥80% (live site with search)
  - Measurement: Tree test results, task success in usability tests
- Search vs. browse: Do users search or navigate?
  - Good: 40-60% browse, 40-60% search (both work)
  - Bad: 90% search (navigation broken), 90% browse (search broken)
  - Measurement: Analytics (search usage %, nav click-through)
- Search refinement rate: % of searches that are refined
  - Target: <30% (users find on first search)
  - Bad: >50% (users search, refine, search again → poor results)
  - Measurement: Analytics (queries per session)
- Bounce rate by entry point: % of visitors leaving immediately
  - Target: <40% for landing pages
  - Bad: >60% (users don't find what they expected)
  - Measurement: Analytics (bounce rate by page)
- Navigation abandonment: % who start navigating, then leave
  - Target: <20%
  - Bad: >40% (users get lost, give up)
  - Measurement: Analytics (drop-off in navigation funnels)
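A sketch of computing a few of these metrics from raw analytics data, assuming a hypothetical per-session event schema:

```python
# Hypothetical per-session analytics records
sessions = [
    {"searches": 2, "nav_clicks": 0, "found": True},
    {"searches": 0, "nav_clicks": 4, "found": True},
    {"searches": 3, "nav_clicks": 1, "found": False},
]

searched = [s for s in sessions if s["searches"] > 0]
browsed = [s for s in sessions if s["nav_clicks"] > 0 and s["searches"] == 0]
refined = [s for s in searched if s["searches"] > 1]  # needed more than one query

print(f"Search share:    {len(searched) / len(sessions):.0%}")
print(f"Browse share:    {len(browsed) / len(sessions):.0%}")
print(f"Refinement rate: {len(refined) / len(searched):.0%}")  # target <30%
print(f"Success rate:    {sum(s['found'] for s in sessions) / len(sessions):.0%}")
```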
Search vs. Navigation Trade-offs
When search is preferred:
- Large content sets (>5000 items)
- Users know exactly what they want ("lookup" mode)
- Diverse content types (hard to categorize consistently)
When navigation is preferred:
- Smaller content sets (<500 items)
- Users browsing, exploring ("discovery" mode)
- Hierarchical domains (clear parent-child relationships)
Best practice: Offer BOTH
- Navigation for discovery, context, exploration
- Search for lookup, speed, known-item finding
Optimizing search:
- Autocomplete: Suggest as user types
- Filters: Narrow results by category, date, type
- Best bets: Featured results for common queries
- Zero-results page: Suggest alternatives, show popular content
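A minimal autocomplete sketch that ties suggestions back to a controlled vocabulary; the term list and alias map are illustrative:

```python
CONTROLLED_TERMS = ["Authentication", "API Reference", "Pricing", "Dashboards"]
ALIASES = {"auth": "Authentication", "plans": "Pricing"}

def autocomplete(prefix: str, limit: int = 5) -> list[str]:
    """Suggest controlled-vocabulary terms matching a partial query."""
    p = prefix.strip().lower()
    hits = [t for t in CONTROLLED_TERMS if t.lower().startswith(p)]
    # Also surface the canonical term when the user types a known alias
    for alias, term in ALIASES.items():
        if alias.startswith(p) and term not in hits:
            hits.append(term)
    return hits[:limit]

print(autocomplete("au"))  # ['Authentication']
print(autocomplete("pl"))  # ['Pricing'], via the 'plans' alias
```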
Optimizing navigation:
- Clear labels: Match user vocabulary (card sort insights)
- Faceted filters: Browse + filter combination
- Related links: Help users discover adjacent content
- Breadcrumbs: Show path, enable backtracking
5. Advanced Topics
Mental Models & User Research
Mental model: User's internal representation of how system works
Why it matters: Navigation should match user's mental model, not company's org chart
Researching mental models:
- Card sorting: Reveals how users group/label content
- User interviews: Ask "How would you organize this?" "What would you call this?"
- Tree testing: Validates if proposed structure matches mental model
- First-click testing: Where do users expect to find X?
Common mismatches:
- Company thinks: "Features" (technical view)
- Users think: "What can I do?" (task view)
- Solution: Rename to task-based labels ("Create Report", "Share Dashboard")
Example: SaaS product
- Internal (wrong): "Modules" → "Synergistic Solutions" → "Widget Management"
- User mental model (right): "Features" → "Reporting" → "Custom Reports"
Cross-Cultural IA
Challenge: Different cultures have different categorization preferences
Examples:
- Alphabetical: Works for Latin scripts, not ideographic (Chinese, Japanese)
- Color coding: Red = danger (Western), Red = luck (Chinese)
- Icons: A mailbox icon means email in the US, but doesn't translate (mailbox designs differ across countries)
Strategies:
- Localization testing: Card sort with target culture users
- Avoid culturally-specific metaphors: "Home run", "touchdown" (US sports)
- Simple, universal labels: "Home", "Search", "Help" (widely understood)
- Icons + text: Don't rely on icons alone
IA Governance
Problem: Taxonomy degrades over time without maintenance
Governance framework:
- Roles:
  - Content owner: Publishes content, assigns categories/tags
  - Taxonomy owner: Maintains category structure, adds/removes categories
  - IA steward: Monitors usage, recommends improvements
- Processes:
  - Quarterly review: Check taxonomy usage, identify issues
  - Change request: How to propose new categories or a restructure
  - Deprecation: Process for removing outdated categories
  - Tag moderation: Review user-generated tags, merge synonyms
- Metrics to monitor (see the sketch after this list):
  - % of content in "Other" or "Uncategorized" (should be <5%)
  - Empty categories (no content) — remove or consolidate
  - Oversized categories (>50% of content) — split into subcategories
- Tools:
  - CMS with taxonomy management
  - Analytics to track usage
  - Automated alerts (e.g., "Category X has no content")
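A sketch of the automated health checks, encoding the monitoring thresholds above (the category counts are made up):

```python
def taxonomy_health(counts: dict[str, int]) -> list[str]:
    """Return governance warnings based on the monitoring thresholds above."""
    total = sum(counts.values())
    warnings = []
    misc = counts.get("Other", 0) + counts.get("Uncategorized", 0)
    if misc / total > 0.05:
        warnings.append(f"'Other'/'Uncategorized' holds {misc / total:.0%} (target <5%)")
    for name, n in counts.items():
        if n == 0:
            warnings.append(f"Empty category {name!r}; remove or consolidate")
        elif n / total > 0.5:
            warnings.append(f"Oversized category {name!r} at {n / total:.0%}; split it")
    return warnings

for warning in taxonomy_health({"Docs": 120, "Tutorials": 0, "Other": 30}):
    print(warning)
```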
Personalization & Dynamic IA
Concept: Navigation adapts to user
Approaches:
- Audience-based: Show different nav for different user types
  - "For Developers", "For Marketers", "For Executives"
- History-based: Prioritize recently visited or frequently used
  - "Recently Viewed", "Your Favorites"
- Context-based: Show nav relevant to the current task
  - "Related Articles", "Next Steps"
- Adaptive search: Results ranked by the user's past behavior
Caution: Don't over-personalize
- Users need consistency to build mental model
- Personalization should augment, not replace, standard navigation
IA for Voice & AI Interfaces
Challenge: Traditional visual hierarchy doesn't work for voice
Strategies:
- Flat structure: No deep nesting (can't show menu)
- Natural language categories: "Where can I find information about X?" vs. "Navigate to Category > Subcategory"
- Conversational: "What would you like to do?" vs. "Select option 1, 2, or 3"
- Context-aware: Remember user's previous question, continue conversation
Example:
- Web: Home > Products > Electronics > Phones
- Voice: "Show me phones" → "Here are our top phone options..."
Summary
Card sorting reveals user mental models through similarity matrices, dendrograms, and consensus scores. Outliers indicate unclear content.
Taxonomy design follows MECE principle (mutually exclusive, collectively exhaustive). Use faceted classification for scale, controlled vocabulary for consistency, and plan for evolution.
Navigation optimization balances breadth (many choices) vs. depth (many clicks). Optimal: 5-9 items per level, 3-4 levels deep. Progressive disclosure reduces initial complexity.
Information scent guides users with clear labels, trigger words, and descriptive breadcrumbs. Track findability metrics: time to find (<30 sec), success rate (≥70%), search vs. browse balance (40-60% each).
Advanced techniques include mental model research (card sort, interviews), cross-cultural adaptation, governance frameworks, personalization, and voice interface design.
The goal: Users can predict where information lives and find it quickly, regardless of access method.