| name | description |
|---|---|
| usability-frameworks | Usability testing methodology, Nielsen's heuristics, and usability metrics for evaluating user interfaces |
Usability Frameworks
Comprehensive frameworks and methodologies for planning, conducting, and analyzing usability tests to improve user experience.
When to Use This Skill
Auto-loaded by agents:
- research-ops: For usability testing and heuristic evaluation
Use when you need:
- Planning usability tests
- Conducting user testing sessions
- Evaluating interface designs
- Identifying usability problems
- Testing prototypes or live products
- Applying Nielsen's heuristics
- Measuring usability metrics
Core Concepts
What is Usability Testing?
Usability testing is a method for evaluating a product by testing it with representative users. Users attempt to complete typical tasks while observers watch, listen, and take notes.
Purpose: Identify usability problems, discover opportunities for improvement, and learn about user behavior and preferences.
When to use:
- Before development (testing prototypes)
- During development (iterative testing)
- After launch (validation and optimization)
- Before major redesigns
The Five Usability Quality Components (Jakob Nielsen)
- Learnability: How easy is it for users to accomplish basic tasks the first time?
- Efficiency: How quickly can users perform tasks once they've learned the design?
- Memorability: Can users remember how to use it after time away?
- Errors: How many errors do users make, how severe, and how easily can they recover?
- Satisfaction: How pleasant is it to use the design?
Usability Testing Methodologies
1. Moderated Testing
Setup: Researcher guides participants through tasks in real time
Location: In-person or remote (video call)
Best for:
- Early-stage prototypes needing clarification
- Complex products requiring guidance
- Exploring "why" behind user behavior
- Uncovering emotional reactions
Process:
- Welcome and set expectations
- Pre-task questions (background, experience)
- Task scenarios with think-aloud protocol
- Post-task questions and discussion
- Wrap-up and thank you
Advantages:
- Rich qualitative insights
- Can probe deeper into issues
- Observe non-verbal cues
- Clarify misunderstandings immediately
Limitations:
- More time-intensive (30-60 min per session)
- Researcher bias possible
- Smaller sample sizes
- Scheduling logistics
2. Unmoderated Testing
Setup: Participants complete tasks independently, recorded for later review
Location: Remote, on participant's own schedule
Best for:
- Mature products with clear tasks
- Large sample sizes needed
- Quick turnaround required
- Benchmarking and metrics
Process:
- Automated instructions and consent
- Participants record screen/audio while completing tasks
- Automated post-task surveys
- Researcher reviews recordings later
Advantages:
- Faster data collection
- Larger sample sizes
- More natural environment
- Lower cost per participant
Limitations:
- Can't probe or clarify
- May miss nuanced insights
- Technical issues harder to resolve
- Participants may skip think-aloud
3. Hybrid Approaches
Combination methods:
- Moderated first impressions + unmoderated task completion
- Unmoderated testing + follow-up interviews with interesting cases
- Moderated pilot + unmoderated scale testing
Nielsen's 10 Usability Heuristics
Quick reference for evaluating interfaces. See references/nielsens-10-heuristics.md for detailed explanations and examples.
1. Visibility of system status - Keep users informed
2. Match between system and real world - Speak users' language
3. User control and freedom - Provide escape hatches
4. Consistency and standards - Follow platform conventions
5. Error prevention - Prevent problems before they occur
6. Recognition rather than recall - Minimize memory load
7. Flexibility and efficiency of use - Accelerators for experts
8. Aesthetic and minimalist design - Remove irrelevant information
9. Help users recognize, diagnose, and recover from errors - Plain language error messages
10. Help and documentation - Provide when needed
Think-Aloud Protocol
What It Is
Participants verbalize their thoughts while completing tasks, providing real-time insight into their mental model.
Types
Concurrent think-aloud: Speak while performing tasks
- More natural thought flow
- May affect task performance slightly
Retrospective think-aloud: Review recording and explain thinking after
- Doesn't disrupt natural behavior
- May forget or rationalize thoughts
Facilitating Think-Aloud
Prompts to use:
- "What are you thinking right now?"
- "What are you looking for?"
- "What would you expect to happen?"
- "Is this what you expected?"
Don't:
- Ask leading questions
- Provide hints or solutions
- Interrupt natural flow too often
- Make participants feel tested
See references/think-aloud-protocol-guide.md for detailed facilitation techniques.
Task Scenario Design
Good task scenarios are critical to meaningful usability test results.
Characteristics of Good Task Scenarios
- Realistic: Based on actual user goals
- Specific: Clear endpoint/success criteria
- Self-contained: Provide all necessary context
- Actionable: Clear starting point
- Not prescriptive: Don't tell them how to do it
Example Transformation
Poor: "Click on the 'My Account' link and change your password"
- Too prescriptive, tells them exactly where to click
Good: "You've heard about recent security breaches and want to make your account more secure. Update your account to use a stronger password."
- Realistic motivation, clear goal, doesn't prescribe path
Task Complexity Levels
Simple tasks (1-2 steps): Establish baseline usability
Medium tasks (3-5 steps): Test core workflows
Complex tasks (6+ steps): Evaluate overall experience and error recovery
See assets/task-scenario-template.md for ready-to-use templates.
Severity Rating Framework
Not all usability issues are equal. Prioritize fixes based on severity.
Three-Factor Severity Rating
Frequency: How often does this issue occur?
- High: > 50% of users encounter
- Medium: 10-50% encounter
- Low: < 10% encounter
Impact: When it occurs, how badly does it affect users?
- High: Prevents task completion / causes data loss
- Medium: Causes frustration or delays
- Low: Minor annoyance
Persistence: Do users overcome it with experience?
- High: Problem doesn't go away
- Medium: Users learn to avoid/work around
- Low: One-time problem only
Combined Severity Ratings
- Critical (P0): High frequency + High impact
- Serious (P1): High frequency + Medium impact, OR Medium frequency + High impact
- Moderate (P2): High frequency + Low impact, OR Medium frequency + Medium impact, OR Low frequency + High impact
- Minor (P3): Everything else
See assets/severity-rating-guide.md for detailed rating criteria and examples.
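As a quick illustration, here is a minimal Python sketch of the frequency-and-impact combination table above. The function name and labels are illustrative, not a standard API; persistence is left as a judgment call for nudging borderline cases up or down.

```python
# A minimal sketch of the frequency x impact combination table above.
# Names and labels are illustrative, not a standard API.
LEVELS = {"high": 2, "medium": 1, "low": 0}

def combined_severity(frequency: str, impact: str) -> str:
    f, i = LEVELS[frequency], LEVELS[impact]
    if (f, i) == (2, 2):
        return "P0 Critical"
    if (f, i) in {(2, 1), (1, 2)}:
        return "P1 Serious"
    if (f, i) in {(2, 0), (1, 1), (0, 2)}:
        return "P2 Moderate"
    return "P3 Minor"  # everything else

print(combined_severity("medium", "high"))  # P1 Serious
```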
Usability Metrics
Quantitative Metrics
Task Success Rate: % of participants who complete the task successfully
- Binary: Did they complete it? (yes/no)
- Partial credit: Did they complete most of it?
Time on Task: How long to complete (for successful completions)
- Compare to baseline or competitor benchmarks
Error Rate: Number of errors per task
- Define what counts as an error for each task
Clicks/Taps to Task Completion: Efficiency measure
- More relevant for well-defined tasks
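To make these metrics concrete, here is a small Python sketch computing them from per-participant session records. The record fields (`completed`, `seconds`, `errors`) are hypothetical stand-ins for whatever your testing tool logs.

```python
# Sketch: computing the core quantitative metrics from hypothetical
# per-participant session records for a single task.
from statistics import mean

sessions = [
    {"completed": True,  "seconds": 74,  "errors": 1},
    {"completed": True,  "seconds": 102, "errors": 0},
    {"completed": False, "seconds": 180, "errors": 3},
    {"completed": True,  "seconds": 88,  "errors": 2},
    {"completed": False, "seconds": 210, "errors": 4},
]

success_rate = sum(s["completed"] for s in sessions) / len(sessions)
# Time on task is conventionally reported for successful completions only.
time_on_task = mean(s["seconds"] for s in sessions if s["completed"])
error_rate = mean(s["errors"] for s in sessions)

print(f"Success: {success_rate:.0%}, time on task: {time_on_task:.0f}s, "
      f"errors per task: {error_rate:.1f}")
# Success: 60%, time on task: 88s, errors per task: 2.0
```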
Standardized Questionnaires
SUS (System Usability Scale):
- 10 questions, 5-point Likert scale
- Score 0-100 (industry avg ~68)
- Quick, reliable, easy to administer
- Good for comparing versions or benchmarking
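SUS scoring follows a fixed formula: odd-numbered (positively worded) items contribute their rating minus 1, even-numbered (negatively worded) items contribute 5 minus their rating, and the sum is multiplied by 2.5 to reach the 0-100 range. A minimal sketch:

```python
# Standard SUS scoring (Brooke). responses[i] is the 1-5 rating for
# item i+1; odd items are positively worded, even items negatively.
def sus_score(responses: list[int]) -> float:
    assert len(responses) == 10 and all(1 <= r <= 5 for r in responses)
    total = 0
    for item, r in enumerate(responses, start=1):
        total += (r - 1) if item % 2 == 1 else (5 - r)
    return total * 2.5  # scales 0-40 raw points to 0-100

print(sus_score([4, 2, 5, 1, 4, 2, 5, 2, 4, 1]))  # 85.0
```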
UMUX (Usability Metric for User Experience):
- 4 questions, lighter than SUS
- Similar reliability
- Faster for participants
SEQ (Single Ease Question):
- "Overall, how difficult or easy was the task to complete?" (1-7)
- One question per task
- Immediate subjective difficulty rating
Other scales:
- SUPR-Q (for websites)
- PSSUQ (post-study)
- NASA-TLX (cognitive load)
Qualitative Insights
Observed behaviors:
- Hesitations and confusion
- Error patterns
- Unexpected paths
- Verbal frustrations
Verbalized thoughts (think-aloud):
- Mental model mismatches
- Expectation violations
- Pleasantly surprising discoveries
Sample Size Guidelines
For Qualitative Insights
Nielsen's recommendation: testing with 5 users uncovers ~85% of usability problems
- Diminishing returns after 5
- Run 3+ small rounds instead of 1 large round
- Iterate between rounds
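The 85% figure comes from the Nielsen/Landauer problem-discovery model, found(n) = 1 - (1 - L)^n, where L is the average probability that a single user exposes any given problem (~0.31 in their published data; your product's value may differ). A quick sketch:

```python
# Nielsen/Landauer problem-discovery model behind the 5-user rule.
def share_found(n_users: int, lam: float = 0.31) -> float:
    return 1 - (1 - lam) ** n_users

for n in (1, 3, 5, 10):
    print(f"{n} users: {share_found(n):.0%}")
# 1 users: 31%, 3 users: 67%, 5 users: 84%, 10 users: 98%
```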
Reality check:
- 5 is a minimum, not ideal
- Complex products may need 8-10
- Multiple user types need 5 each
For Quantitative Metrics
Benchmarking: 20+ users per user group
A/B testing: Depends on effect size and desired confidence
Statistical significance: Use power analysis calculators
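If a calculator isn't handy, a rough normal-approximation sample size for comparing task success rates between two designs can be sketched as below. The z constants assume a two-sided alpha of 0.05 and 80% power; the function name is illustrative.

```python
# Rough per-group sample size for detecting a difference in task
# success rates (normal approximation for two proportions).
# 1.96 = two-sided z at alpha=0.05; 0.84 = z at 80% power.
import math

def n_per_group(p1: float, p2: float,
                z_alpha: float = 1.96, z_beta: float = 0.84) -> int:
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return math.ceil((z_alpha + z_beta) ** 2 * variance / (p1 - p2) ** 2)

# Detecting a 70% -> 85% improvement in success rate:
print(n_per_group(0.70, 0.85))  # 118
```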
Planning Your Usability Test
1. Define Objectives
What decisions will this research inform?
- Redesign priorities?
- Feature cut decisions?
- Success of recent changes?
2. Identify User Segments
Who needs to be tested?
- New vs. experienced users?
- Different roles or use cases?
- Different devices or contexts?
3. Select Tasks
What tasks represent success?
- Most critical user goals
- Most frequent tasks
- Recently changed features
- Known problem areas
4. Choose Methodology
Moderated, unmoderated, or hybrid?
- Consider timeline, budget, research questions
5. Create Test Script
See assets/usability-test-script-template.md for a ready-to-use structure including:
- Welcome and consent
- Background questions
- Task instructions
- Probing questions
- Wrap-up
6. Recruit Participants
- Define screening criteria
- Aim for 5-10 per user segment
- Plan for no-shows (recruit 20% extra)
- Offer appropriate incentives
7. Conduct Pilot Test
- Test with colleague or friend
- Validate timing
- Check recording setup
- Refine unclear tasks
8. Run Sessions
- Stay neutral and encouraging
- Observe without interfering
- Take detailed notes
- Record if permitted
9. Analyze and Synthesize
- Code issues by severity
- Identify patterns across participants
- Link issues to heuristics violated
- Quantify task success and time
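A lightweight way to surface patterns is to tally which participants hit each coded issue, as in the sketch below; the issue codes, heuristic labels, and five-participant study are hypothetical examples.

```python
# Tally coded issues across participants to surface patterns.
# Issue codes and heuristic labels are hypothetical examples.
from collections import defaultdict

observations = [  # (participant, issue_code, heuristic_violated)
    ("P1", "cant-find-search", "Recognition rather than recall"),
    ("P2", "cant-find-search", "Recognition rather than recall"),
    ("P2", "unclear-error-message", "Error recovery"),
    ("P4", "cant-find-search", "Recognition rather than recall"),
]

hit_by = defaultdict(set)
for participant, issue, _heuristic in observations:
    hit_by[issue].add(participant)

n_participants = 5
for issue, who in sorted(hit_by.items(), key=lambda kv: -len(kv[1])):
    print(f"{issue}: {len(who)}/{n_participants} participants")
# cant-find-search: 3/5 participants
# unclear-error-message: 1/5 participants
```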
10. Report and Recommend
- Prioritized issue list
- Video clips of critical issues
- Recommendations with rationale
- Quick wins vs. strategic fixes
Integration with Product Development
When to Test
Discovery phase: Test competitors or analogous products
Concept phase: Test paper prototypes or wireframes
Design phase: Test high-fidelity mockups
Development phase: Test working builds iteratively
Pre-launch: Validate before release
Post-launch: Identify optimization opportunities
Continuous Usability Testing
Build it into your process:
- Weekly or bi-weekly test sessions
- Rotating focus (new features, established flows, mobile vs. desktop)
- Standing recruiting panel
- Lightweight reporting to team
Ready-to-Use Templates
We provide templates to accelerate your usability testing:
In assets/:
- usability-test-script-template.md: Complete moderator script structure
- task-scenario-template.md: Framework for creating effective task scenarios
- severity-rating-guide.md: Detailed criteria for rating usability issues
In references/:
- nielsens-10-heuristics.md: Deep dive into each heuristic with examples
- think-aloud-protocol-guide.md: Advanced facilitation techniques and troubleshooting
Common Pitfalls to Avoid
- Leading participants: "Was that easy?" → "How would you describe that experience?"
- Testing the wrong tasks: Tasks that aren't real user goals
- Over-explaining: Let users struggle and discover issues naturally
- Ignoring severity: Fixing cosmetic issues while critical issues remain
- Testing too late: After it's expensive to change
- Not iterating: One-and-done testing instead of continuous improvement
- Confusing usability with preference: "I like green" ≠ usability issue
- Sample bias: Testing only power users or only complete novices
Further Learning
Books:
- "Rocket Surgery Made Easy" by Steve Krug
- "Handbook of Usability Testing" by Jeffrey Rubin
- "Moderating Usability Tests" by Joseph Dumas
Online resources:
- Nielsen Norman Group articles
- Usability.gov
- Baymard Institute research
This skill provides the foundation for conducting effective usability testing. Use the templates in assets/ for quick starts and references/ for deeper dives into specific techniques.