---
name: usability-frameworks
description: Usability testing methodology, Nielsen's heuristics, and usability metrics for evaluating user interfaces
---

# Usability Frameworks

Comprehensive frameworks and methodologies for planning, conducting, and analyzing usability tests to improve user experience.

## When to Use This Skill

**Auto-loaded by agents**:
- `research-ops` - For usability testing and heuristic evaluation

**Use when you need**:
- Planning usability tests
- Conducting user testing sessions
- Evaluating interface designs
- Identifying usability problems
- Testing prototypes or live products
- Applying Nielsen's heuristics
- Measuring usability metrics

## Core Concepts

### What is Usability Testing?

Usability testing is a method for evaluating a product by testing it with representative users. Users attempt to complete typical tasks while observers watch, listen, and take notes.

**Purpose**: Identify usability problems, discover opportunities for improvement, and learn about user behavior and preferences.

**When to use**:
- Before development (testing prototypes)
- During development (iterative testing)
- After launch (validation and optimization)
- Before major redesigns

### The Five Usability Quality Components (Jakob Nielsen)

1. **Learnability**: How easy is it for users to accomplish basic tasks the first time?
2. **Efficiency**: How quickly can users perform tasks once they've learned the design?
3. **Memorability**: Can users remember how to use it after time away?
4. **Errors**: How many errors do users make, how severe are they, and how easily can users recover?
5. **Satisfaction**: How pleasant is it to use the design?

## Usability Testing Methodologies

### 1. Moderated Testing

**Setup**: Researcher guides participants through tasks in real time
**Location**: In-person or remote (video call)

**Best for**:
- Early-stage prototypes needing clarification
- Complex products requiring guidance
- Exploring the "why" behind user behavior
- Uncovering emotional reactions

**Process**:
1. Welcome and set expectations
2. Pre-task questions (background, experience)
3. Task scenarios with think-aloud protocol
4. Post-task questions and discussion
5. Wrap-up and thank you

**Advantages**:
- Rich qualitative insights
- Can probe deeper into issues
- Observe non-verbal cues
- Clarify misunderstandings immediately

**Limitations**:
- More time-intensive (30-60 min per session)
- Possible researcher bias
- Smaller sample sizes
- Scheduling logistics

### 2. Unmoderated Testing

**Setup**: Participants complete tasks independently, recorded for later review
**Location**: Remote, on the participant's own schedule

**Best for**:
- Mature products with clear tasks
- Large sample sizes needed
- Quick turnaround required
- Benchmarking and metrics

**Process**:
1. Automated instructions and consent
2. Participants record screen/audio while completing tasks
3. Automated post-task surveys
4. Researcher reviews recordings later

**Advantages**:
- Faster data collection
- Larger sample sizes
- More natural environment
- Lower cost per participant

**Limitations**:
- Can't probe or clarify
- May miss nuanced insights
- Technical issues harder to resolve
- Participants may skip think-aloud

### 3. Hybrid Approaches

**Combination methods**:
- Moderated first impressions + unmoderated task completion
- Unmoderated testing + follow-up interviews with interesting cases
- Moderated pilot + unmoderated scale testing

## Nielsen's 10 Usability Heuristics

Quick reference for evaluating interfaces. See `references/nielsens-10-heuristics.md` for detailed explanations and examples.

1. **Visibility of system status** - Keep users informed
2. **Match between system and real world** - Speak the users' language
3. **User control and freedom** - Provide escape hatches
4. **Consistency and standards** - Follow platform conventions
5. **Error prevention** - Prevent problems before they occur
6. **Recognition rather than recall** - Minimize memory load
7. **Flexibility and efficiency of use** - Accelerators for experts
8. **Aesthetic and minimalist design** - Remove irrelevant information
9. **Help users recognize, diagnose, and recover from errors** - Plain-language error messages
10. **Help and documentation** - Provide when needed

## Think-Aloud Protocol

### What It Is

Participants verbalize their thoughts while completing tasks, providing real-time insight into their mental model.

### Types

**Concurrent think-aloud**: Speak while performing tasks
- More natural thought flow
- May affect task performance slightly

**Retrospective think-aloud**: Review the recording and explain thinking afterward
- Doesn't disrupt natural behavior
- Participants may forget or rationalize thoughts

### Facilitating Think-Aloud

**Prompts to use**:
- "What are you thinking right now?"
- "What are you looking for?"
- "What would you expect to happen?"
- "Is this what you expected?"

**Don't**:
- Ask leading questions
- Provide hints or solutions
- Interrupt the natural flow too often
- Make participants feel tested

See `references/think-aloud-protocol-guide.md` for detailed facilitation techniques.

## Task Scenario Design

Good task scenarios are critical to meaningful usability test results.

### Characteristics of Good Task Scenarios

**Realistic**: Based on actual user goals
**Specific**: Clear endpoint/success criteria
**Self-contained**: Provide all necessary context
**Actionable**: Clear starting point
**Not prescriptive**: Don't tell participants how to do it

### Example Transformation

**Poor**: "Click on the 'My Account' link and change your password"
- Too prescriptive; tells them exactly where to click

**Good**: "You've heard about recent security breaches and want to make your account more secure. Update your account to use a stronger password."
- Realistic motivation, clear goal, doesn't prescribe a path

### Task Complexity Levels

**Simple tasks** (1-2 steps): Establish baseline usability
**Medium tasks** (3-5 steps): Test core workflows
**Complex tasks** (6+ steps): Evaluate overall experience and error recovery

See `assets/task-scenario-template.md` for ready-to-use templates.

## Severity Rating Framework

Not all usability issues are equal. Prioritize fixes based on severity.

### Three-Factor Severity Rating

**Frequency**: How often does this issue occur?
- High: > 50% of users encounter it
- Medium: 10-50% encounter it
- Low: < 10% encounter it

**Impact**: When it occurs, how badly does it affect users?
- High: Prevents task completion / causes data loss
- Medium: Causes frustration or delays
- Low: Minor annoyance

**Persistence**: Do users overcome it with experience?
- High: The problem doesn't go away
- Medium: Users learn to avoid or work around it
- Low: One-time problem only

### Combined Severity Ratings

**Critical** (P0): High frequency + High impact
**Serious** (P1): High frequency + Medium impact, OR Medium frequency + High impact
**Moderate** (P2): High frequency + Low impact, OR Medium frequency + Medium impact, OR Low frequency + High impact
**Minor** (P3): Everything else

See `assets/severity-rating-guide.md` for detailed rating criteria and examples.
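Because the combined ratings form a fixed decision table, they are easy to encode when triaging a long list of findings. The sketch below (Python) is purely illustrative: the function names, band thresholds, and labels mirror the table above and are not part of any standard tool.

```python
# Minimal sketch of the severity decision table above.
# Names and labels are illustrative, not a standard API.

def frequency_band(share_of_users: float) -> str:
    """Classify issue frequency from the share of users affected (0.0-1.0)."""
    if share_of_users > 0.50:
        return "high"
    if share_of_users >= 0.10:
        return "medium"
    return "low"

def severity(frequency: str, impact: str) -> str:
    """Combine frequency and impact bands into a priority label."""
    pair = (frequency, impact)
    if pair == ("high", "high"):
        return "P0 (Critical)"
    if pair in {("high", "medium"), ("medium", "high")}:
        return "P1 (Serious)"
    if pair in {("high", "low"), ("medium", "medium"), ("low", "high")}:
        return "P2 (Moderate)"
    return "P3 (Minor)"

# 4 of 6 participants hit the issue, and it blocked task completion.
print(severity(frequency_band(4 / 6), "high"))  # P0 (Critical)
```

Note that the combined table keys only on frequency and impact; persistence is still worth recording alongside the label for breaking ties between issues at the same priority.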
## Usability Metrics

### Quantitative Metrics

**Task Success Rate**: % of participants who complete the task successfully
- Binary: Did they complete it? (yes/no)
- Partial credit: Did they complete most of it?

**Time on Task**: How long to complete (for successful completions)
- Compare to baseline or competitor benchmarks

**Error Rate**: Number of errors per task
- Define what counts as an error for each task

**Clicks/Taps to Task Completion**: Efficiency measure
- More relevant for well-defined tasks

### Standardized Questionnaires

**SUS (System Usability Scale)**:
- 10 questions, 5-point Likert scale
- Score 0-100 (industry avg ~68)
- Quick, reliable, easy to administer
- Good for comparing versions or benchmarking
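SUS scoring follows a fixed procedure from John Brooke's original formulation: odd-numbered items contribute `response - 1`, even-numbered (negatively worded) items contribute `5 - response`, and the summed contributions are multiplied by 2.5 to stretch the 0-40 raw range to 0-100. A minimal sketch, assuming responses arrive as ten integers in questionnaire order:

```python
def sus_score(responses: list[int]) -> float:
    """Score one completed SUS questionnaire (ten 1-5 responses, in order)."""
    if len(responses) != 10 or not all(1 <= r <= 5 for r in responses):
        raise ValueError("SUS needs ten responses on a 1-5 scale")
    total = sum(
        (r - 1) if i % 2 == 0 else (5 - r)  # index 0 is item 1 (odd-numbered)
        for i, r in enumerate(responses)
    )
    return total * 2.5  # scale 0-40 raw points to 0-100

print(sus_score([4, 2, 5, 1, 4, 2, 5, 2, 4, 1]))  # 85.0
```

Report the mean score across participants, and remember the result is not a percentage: ~68 is the industry average, not 68%.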
**UMUX (Usability Metric for User Experience)**:
- 4 questions, lighter than SUS
- Similar reliability
- Faster for participants

**SEQ (Single Ease Question)**:
- "Overall, how difficult or easy was the task to complete?" (1-7)
- One question per task
- Immediate subjective difficulty rating

**Other scales**:
- SUPR-Q (for websites)
- PSSUQ (post-study)
- NASA-TLX (cognitive load)

### Qualitative Insights

**Observed behaviors**:
- Hesitations and confusion
- Error patterns
- Unexpected paths
- Verbal frustrations

**Verbalized thoughts** (think-aloud):
- Mental model mismatches
- Expectation violations
- Pleasantly surprising discoveries

## Sample Size Guidelines

### For Qualitative Insights

**Nielsen's recommendation**: 5 users find ~85% of usability problems (see the sketch below)
- Diminishing returns after 5
- Run 3+ small rounds instead of 1 large round
- Iterate between rounds

**Reality check**:
- 5 is a minimum, not ideal
- Complex products may need 8-10
- Multiple user types need 5 each
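The ~85% figure comes from the problem-discovery model Nielsen and Landauer fitted to their studies: the expected share of problems found by `n` users is `1 - (1 - L)^n`, where `L` is the probability that a single user exposes a given problem (about 0.31 on average in their data). A quick sketch of the curve; treat `L = 0.31` as a planning heuristic, since the rate varies by product and task:

```python
def proportion_found(n_users: int, detection_rate: float = 0.31) -> float:
    """Expected share of problems seen by at least one of n users."""
    return 1 - (1 - detection_rate) ** n_users

for n in (1, 3, 5, 8, 10):
    print(f"{n:>2} users -> {proportion_found(n):.0%}")
# Prints roughly: 31%, 67%, 84%, 95%, 98%
```

The curve flattens quickly past five users, which is the quantitative basis for running several small, iterated rounds rather than one large study.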
### For Quantitative Metrics

**Benchmarking**: 20+ users per user group
**A/B testing**: Depends on effect size and desired confidence
**Statistical significance**: Use power analysis calculators

## Planning Your Usability Test

### 1. Define Objectives

What decisions will this research inform?
- Redesign priorities?
- Feature cut decisions?
- Success of recent changes?

### 2. Identify User Segments

Who needs to be tested?
- New vs. experienced users?
- Different roles or use cases?
- Different devices or contexts?

### 3. Select Tasks

What tasks represent success?
- Most critical user goals
- Most frequent tasks
- Recently changed features
- Known problem areas

### 4. Choose Methodology

Moderated, unmoderated, or hybrid?
- Consider timeline, budget, and research questions

### 5. Create Test Script

See `assets/usability-test-script-template.md` for a ready-to-use structure including:
- Welcome and consent
- Background questions
- Task instructions
- Probing questions
- Wrap-up

### 6. Recruit Participants

- Define screening criteria
- Aim for 5-10 per user segment
- Plan for no-shows (recruit 20% extra)
- Offer appropriate incentives

### 7. Conduct Pilot Test

- Test with a colleague or friend
- Validate timing
- Check the recording setup
- Refine unclear tasks

### 8. Run Sessions

- Stay neutral and encouraging
- Observe without interfering
- Take detailed notes
- Record if permitted

### 9. Analyze and Synthesize

- Code issues by severity
- Identify patterns across participants
- Link issues to the heuristics they violate
- Quantify task success and time

### 10. Report and Recommend

- Prioritized issue list
- Video clips of critical issues
- Recommendations with rationale
- Quick wins vs. strategic fixes

## Integration with Product Development

### When to Test

**Discovery phase**: Test competitors or analogous products
**Concept phase**: Test paper prototypes or wireframes
**Design phase**: Test high-fidelity mockups
**Development phase**: Test working builds iteratively
**Pre-launch**: Validate before release
**Post-launch**: Identify optimization opportunities

### Continuous Usability Testing

**Build it into your process**:
- Weekly or bi-weekly test sessions
- Rotating focus (new features, established flows, mobile vs. desktop)
- Standing recruiting panel
- Lightweight reporting to the team

## Ready-to-Use Templates

We provide templates to accelerate your usability testing:

### In `assets/`:
- **usability-test-script-template.md**: Complete moderator script structure
- **task-scenario-template.md**: Framework for creating effective task scenarios
- **severity-rating-guide.md**: Detailed criteria for rating usability issues

### In `references/`:
- **nielsens-10-heuristics.md**: Deep dive into each heuristic with examples
- **think-aloud-protocol-guide.md**: Advanced facilitation techniques and troubleshooting

## Common Pitfalls to Avoid

1. **Leading participants**: "Was that easy?" → "How would you describe that experience?"
2. **Testing the wrong tasks**: Tasks that aren't real user goals
3. **Over-explaining**: Let users struggle and discover issues naturally
4. **Ignoring severity**: Fixing cosmetic issues while critical issues remain
5. **Testing too late**: After it's expensive to change
6. **Not iterating**: One-and-done testing instead of continuous improvement
7. **Confusing usability with preference**: "I like green" ≠ usability issue
8. **Sample bias**: Testing only power users or only complete novices

## Further Learning

**Books**:
- "Rocket Surgery Made Easy" by Steve Krug
- "Handbook of Usability Testing" by Jeffrey Rubin
- "Moderating Usability Tests" by Joseph Dumas

**Online resources**:
- Nielsen Norman Group articles
- Usability.gov
- Baymard Institute research

---

This skill provides the foundation for conducting effective usability testing. Use the templates in `assets/` for quick starts and `references/` for deeper dives into specific techniques.