name: usability-frameworks
description: Usability testing methodology, Nielsen's heuristics, and usability metrics for evaluating user interfaces

Usability Frameworks

Comprehensive frameworks and methodologies for planning, conducting, and analyzing usability tests to improve user experience.

When to Use This Skill

Auto-loaded by agents:

  • research-ops - For usability testing and heuristic evaluation

Use this skill when:

  • Planning usability tests
  • Conducting user testing sessions
  • Evaluating interface designs
  • Identifying usability problems
  • Testing prototypes or live products
  • Applying Nielsen's heuristics
  • Measuring usability metrics

Core Concepts

What is Usability Testing?

Usability testing is a method for evaluating a product by testing it with representative users. Users attempt to complete typical tasks while observers watch, listen, and take notes.

Purpose: Identify usability problems, discover opportunities for improvement, and learn about user behavior and preferences.

When to use:

  • Before development (testing prototypes)
  • During development (iterative testing)
  • After launch (validation and optimization)
  • Before major redesigns

The Five Usability Quality Components (Jakob Nielsen)

  1. Learnability: How easy is it for users to accomplish basic tasks the first time?
  2. Efficiency: How quickly can users perform tasks once they've learned the design?
  3. Memorability: Can users remember how to use it after time away?
  4. Errors: How many errors do users make, how severe, and how easily can they recover?
  5. Satisfaction: How pleasant is it to use the design?

Usability Testing Methodologies

1. Moderated Testing

Setup: Researcher guides participants through tasks in real time
Location: In-person or remote (video call)

Best for:

  • Early-stage prototypes needing clarification
  • Complex products requiring guidance
  • Exploring "why" behind user behavior
  • Uncovering emotional reactions

Process:

  1. Welcome and set expectations
  2. Pre-task questions (background, experience)
  3. Task scenarios with think-aloud protocol
  4. Post-task questions and discussion
  5. Wrap-up and thank you

Advantages:

  • Rich qualitative insights
  • Can probe deeper into issues
  • Observe non-verbal cues
  • Clarify misunderstandings immediately

Limitations:

  • More time-intensive (30-60 min per session)
  • Researcher bias possible
  • Smaller sample sizes
  • Scheduling logistics

2. Unmoderated Testing

Setup: Participants complete tasks independently while their sessions are recorded for later review
Location: Remote, on the participant's own schedule

Best for:

  • Mature products with clear tasks
  • Large sample sizes needed
  • Quick turnaround required
  • Benchmarking and metrics

Process:

  1. Automated instructions and consent
  2. Participants record screen/audio while completing tasks
  3. Automated post-task surveys
  4. Researcher reviews recordings later

Advantages:

  • Faster data collection
  • Larger sample sizes
  • More natural environment
  • Lower cost per participant

Limitations:

  • Can't probe or clarify
  • May miss nuanced insights
  • Technical issues harder to resolve
  • Participants may skip think-aloud

3. Hybrid Approaches

Combination methods:

  • Moderated first impressions + unmoderated task completion
  • Unmoderated testing + follow-up interviews with interesting cases
  • Moderated pilot + unmoderated scale testing

Nielsen's 10 Usability Heuristics

Quick reference for evaluating interfaces. See references/nielsens-10-heuristics.md for detailed explanations and examples.

  1. Visibility of system status - Keep users informed
  2. Match between system and real world - Speak users' language
  3. User control and freedom - Provide escape hatches
  4. Consistency and standards - Follow platform conventions
  5. Error prevention - Prevent problems before they occur
  6. Recognition rather than recall - Minimize memory load
  7. Flexibility and efficiency of use - Accelerators for experts
  8. Aesthetic and minimalist design - Remove irrelevant information
  9. Help users recognize, diagnose, and recover from errors - Plain language error messages
  10. Help and documentation - Provide when needed

Think-Aloud Protocol

What It Is

Participants verbalize their thoughts while completing tasks, providing real-time insight into their mental model.

Types

Concurrent think-aloud: Speak while performing tasks

  • More natural thought flow
  • May affect task performance slightly

Retrospective think-aloud: Review recording and explain thinking after

  • Doesn't disrupt natural behavior
  • May forget or rationalize thoughts

Facilitating Think-Aloud

Prompts to use:

  • "What are you thinking right now?"
  • "What are you looking for?"
  • "What would you expect to happen?"
  • "Is this what you expected?"

Don't:

  • Ask leading questions
  • Provide hints or solutions
  • Interrupt natural flow too often
  • Make participants feel tested

See references/think-aloud-protocol-guide.md for detailed facilitation techniques.

Task Scenario Design

Good task scenarios are critical to meaningful usability test results.

Characteristics of Good Task Scenarios

Realistic: Based on actual user goals
Specific: Clear endpoint/success criteria
Self-contained: Provide all necessary context
Actionable: Clear starting point
Not prescriptive: Don't tell them how to do it

Example Transformation

Poor: "Click on the 'My Account' link and change your password"

  • Too prescriptive, tells them exactly where to click

Good: "You've heard about recent security breaches and want to make your account more secure. Update your account to use a stronger password."

  • Realistic motivation, clear goal, doesn't prescribe path

Task Complexity Levels

Simple tasks (1-2 steps): Establish baseline usability
Medium tasks (3-5 steps): Test core workflows
Complex tasks (6+ steps): Evaluate overall experience and error recovery

See assets/task-scenario-template.md for ready-to-use templates.

Severity Rating Framework

Not all usability issues are equal. Prioritize fixes based on severity.

Three-Factor Severity Rating

Frequency: How often does this issue occur?

  • High: > 50% of users encounter
  • Medium: 10-50% encounter
  • Low: < 10% encounter

Impact: When it occurs, how badly does it affect users?

  • High: Prevents task completion / causes data loss
  • Medium: Causes frustration or delays
  • Low: Minor annoyance

Persistence: Do users overcome it with experience?

  • High: Problem doesn't go away
  • Medium: Users learn to avoid/work around
  • Low: One-time problem only

Combined Severity Ratings

Critical (P0): High frequency + High impact
Serious (P1): High frequency + Medium impact, OR Medium frequency + High impact
Moderate (P2): High frequency + Low impact, OR Medium frequency + Medium impact, OR Low frequency + High impact
Minor (P3): Everything else
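
For triage at scale, the combined table above reduces to simple arithmetic on the two levels. A minimal sketch in Python (the level names and P0-P3 labels follow the table; everything else is illustrative):

```python
def severity(frequency: str, impact: str) -> str:
    """Map a frequency x impact pair to a priority per the table above."""
    levels = {"high": 2, "medium": 1, "low": 0}
    score = levels[frequency] + levels[impact]
    # The combined ratings reduce to the sum of the two levels:
    # 4 -> Critical, 3 -> Serious, 2 -> Moderate, 0-1 -> Minor.
    return {4: "Critical (P0)", 3: "Serious (P1)", 2: "Moderate (P2)"}.get(score, "Minor (P3)")

print(severity("medium", "high"))  # Serious (P1)
```

Since the combined ratings use only frequency and impact, one option is to treat persistence as a tiebreaker when two issues land in the same priority level.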

See assets/severity-rating-guide.md for detailed rating criteria and examples.

Usability Metrics

Quantitative Metrics

Task Success Rate: % of participants who complete task successfully

  • Binary: Did they complete it? (yes/no)
  • Partial credit: Did they complete most of it?

Time on Task: How long to complete (for successful completions)

  • Compare to baseline or competitor benchmarks

Error Rate: Number of errors per task

  • Define what counts as an error for each task

Clicks/Taps to Task Completion: Efficiency measure

  • More relevant for well-defined tasks
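
A minimal sketch of computing these metrics from session logs; the record shape and the numbers are hypothetical:

```python
from statistics import mean, median

# Hypothetical per-session records: (participant, completed, seconds, errors)
sessions = [
    ("P1", True, 94, 0),
    ("P2", True, 132, 1),
    ("P3", False, 210, 3),
    ("P4", True, 101, 0),
    ("P5", False, 185, 2),
]

success_rate = mean(1 if done else 0 for _, done, _, _ in sessions)
# Time on task is reported for successful completions only, as noted above;
# the median resists the long right tail typical of task times.
median_time = median(secs for _, done, secs, _ in sessions if done)
errors_per_task = mean(errs for _, _, _, errs in sessions)

print(f"Success rate: {success_rate:.0%}")          # 60%
print(f"Median time (successes): {median_time}s")   # 101s
print(f"Errors per task: {errors_per_task:.1f}")    # 1.2
```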

Standardized Questionnaires

SUS (System Usability Scale):

  • 10 questions, 5-point Likert scale
  • Score 0-100 (industry avg ~68)
  • Quick, reliable, easy to administer
  • Good for comparing versions or benchmarking
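
SUS scoring follows a fixed recipe: odd-numbered items contribute (response − 1), even-numbered items contribute (5 − response), and the sum is multiplied by 2.5. A small sketch (the sample responses are made up):

```python
def sus_score(responses: list[int]) -> float:
    """Score ten 1-5 Likert responses, given in SUS question order."""
    if len(responses) != 10:
        raise ValueError("SUS requires exactly 10 responses")
    total = sum(
        (r - 1) if i % 2 == 0 else (5 - r)  # odd items positive, even negative
        for i, r in enumerate(responses)
    )
    return total * 2.5  # scales the 0-40 raw sum to 0-100

print(sus_score([4, 2, 4, 1, 4, 2, 5, 1, 4, 2]))  # 82.5, above the ~68 average
```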

UMUX (Usability Metric for User Experience):

  • 4 questions, lighter than SUS
  • Similar reliability
  • Faster for participants

SEQ (Single Ease Question):

  • "Overall, how difficult or easy was the task to complete?" (1-7)
  • One question per task
  • Immediate subjective difficulty rating

Other scales:

  • SUPR-Q (for websites)
  • PSSUQ (post-study)
  • NASA-TLX (cognitive load)

Qualitative Insights

Observed behaviors:

  • Hesitations and confusion
  • Error patterns
  • Unexpected paths
  • Verbal frustrations

Verbalized thoughts (think-aloud):

  • Mental model mismatches
  • Expectation violations
  • Pleasantly surprising discoveries

Sample Size Guidelines

For Qualitative Insights

Nielsen's recommendation: 5 users find ~85% of usability problems (see the discovery curve sketched after this list)

  • Diminishing returns after 5
  • Run 3+ small rounds instead of 1 large round
  • Iterate between rounds
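
The ~85% figure comes from the Nielsen-Landauer discovery model: the share of problems found by n users is 1 − (1 − L)^n, where L is the probability that a single user surfaces a given problem (about 0.31 in Nielsen's data). A quick sketch:

```python
L = 0.31  # per-user probability of surfacing a given problem (Nielsen's estimate)
for n in (1, 3, 5, 8, 15):
    found = 1 - (1 - L) ** n
    print(f"{n:2d} users -> {found:.0%} of problems found")
# n = 5 yields ~84%, the basis of the "five users" guideline; each extra
# user past that adds less, which is why small iterated rounds win.
```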

Reality check:

  • 5 is a minimum, not ideal
  • Complex products may need 8-10
  • Multiple user types need 5 each

For Quantitative Metrics

Benchmarking: 20+ users per user group
A/B testing: Depends on effect size and desired confidence
Statistical significance: Use power analysis calculators
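
As one way to run that power analysis, here is a sketch using statsmodels to size a comparison of task success rates between two designs (the 80% vs. 60% expected rates are assumptions for illustration):

```python
from statsmodels.stats.power import NormalIndPower
from statsmodels.stats.proportion import proportion_effectsize

# Expected task success: 80% on the new design vs. 60% on the current one.
effect = proportion_effectsize(0.80, 0.60)  # Cohen's h for two proportions
n = NormalIndPower().solve_power(effect_size=effect, alpha=0.05, power=0.80)
print(f"~{n:.0f} participants per group")  # roughly 40 per group
```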

Planning Your Usability Test

1. Define Objectives

What decisions will this research inform?

  • Redesign priorities?
  • Feature cut decisions?
  • Success of recent changes?

2. Identify User Segments

Who needs to be tested?

  • New vs. experienced users?
  • Different roles or use cases?
  • Different devices or contexts?

3. Select Tasks

What tasks represent success?

  • Most critical user goals
  • Most frequent tasks
  • Recently changed features
  • Known problem areas

4. Choose Methodology

Moderated, unmoderated, or hybrid?

  • Consider timeline, budget, research questions

5. Create Test Script

See assets/usability-test-script-template.md for a ready-to-use structure including:

  • Welcome and consent
  • Background questions
  • Task instructions
  • Probing questions
  • Wrap-up

6. Recruit Participants

  • Define screening criteria
  • Aim for 5-10 per user segment
  • Plan for no-shows (recruit 20% extra)
  • Offer appropriate incentives

7. Conduct Pilot Test

  • Test with colleague or friend
  • Validate timing
  • Check recording setup
  • Refine unclear tasks

8. Run Sessions

  • Stay neutral and encouraging
  • Observe without interfering
  • Take detailed notes
  • Record if permitted

9. Analyze and Synthesize

  • Code issues by severity
  • Identify patterns across participants
  • Link issues to heuristics violated
  • Quantify task success and time

10. Report and Recommend

  • Prioritized issue list
  • Video clips of critical issues
  • Recommendations with rationale
  • Quick wins vs. strategic fixes

Integration with Product Development

When to Test

Discovery phase: Test competitors or analogous products
Concept phase: Test paper prototypes or wireframes
Design phase: Test high-fidelity mockups
Development phase: Test working builds iteratively
Pre-launch: Validate before release
Post-launch: Identify optimization opportunities

Continuous Usability Testing

Build it into your process:

  • Weekly or bi-weekly test sessions
  • Rotating focus (new features, established flows, mobile vs. desktop)
  • Standing recruiting panel
  • Lightweight reporting to team

Ready-to-Use Templates

We provide templates to accelerate your usability testing:

In assets/:

  • usability-test-script-template.md: Complete moderator script structure
  • task-scenario-template.md: Framework for creating effective task scenarios
  • severity-rating-guide.md: Detailed criteria for rating usability issues

In references/:

  • nielsens-10-heuristics.md: Deep dive into each heuristic with examples
  • think-aloud-protocol-guide.md: Advanced facilitation techniques and troubleshooting

Common Pitfalls to Avoid

  1. Leading participants: "Was that easy?" → "How would you describe that experience?"
  2. Testing the wrong tasks: Tasks that aren't real user goals
  3. Over-explaining: Let users struggle and discover issues naturally
  4. Ignoring severity: Fixing cosmetic issues while critical issues remain
  5. Testing too late: After it's expensive to change
  6. Not iterating: One-and-done testing instead of continuous improvement
  7. Confusing usability with preference: "I like green" ≠ usability issue
  8. Sample bias: Testing only power users or only complete novices

Further Learning

Books:

  • "Rocket Surgery Made Easy" by Steve Krug
  • "Handbook of Usability Testing" by Jeffrey Rubin
  • "Moderating Usability Tests" by Joseph Dumas

Online resources:

  • Nielsen Norman Group articles
  • Usability.gov
  • Baymard Institute research

This skill provides the foundation for conducting effective usability testing. Use the templates in assets/ for quick starts and references/ for deeper dives into specific techniques.