Evaluation Rubrics Templates
Quick-start templates for purpose definition, criteria selection, scale design, descriptor writing, and rubric formats.
Workflow
Rubric Development Progress:
- [ ] Step 1: Define purpose and scope
- [ ] Step 2: Identify evaluation criteria
- [ ] Step 3: Design the scale
- [ ] Step 4: Write performance descriptors
- [ ] Step 5: Test and calibrate
- [ ] Step 6: Use and iterate
Step 1: Define purpose and scope
Use the Purpose Definition Template to clarify the evaluation context and constraints.
Step 2: Identify evaluation criteria
Brainstorm and prioritize quality dimensions using the Criteria Identification Template.
Step 3: Design the scale
Select the scale type and number of levels using the Scale Selection Template.
Step 4: Write performance descriptors
Write clear, observable descriptors using the Descriptor Writing Template.
Step 5: Test and calibrate
Conduct inter-rater reliability testing and refine the rubric.
Step 6: Use and iterate
Apply the rubric, collect feedback, and revise as needed.
Purpose Definition Template
What are we evaluating?
- Artifact type: [e.g., code pull requests, research proposals, design mockups, student essays]
- Specific context: [e.g., internal code review, grant competition, course assignment]
Who will evaluate?
- Number of evaluators: [Single reviewer or multiple?]
- Evaluator expertise: [Subject matter experts, peers, instructors, automated systems]
- Evaluator availability: [Time per evaluation? Total volume?]
Who are the evaluatees?
- Audience: [Students, employees, vendors, applicants]
- Skill level: [Novice, intermediate, expert]
- Will they see the rubric before evaluation? [Yes/No; if yes, the rubric also serves as a guide]
What decisions depend on scores?
- High stakes: [Pass/fail, hiring, funding, promotion, grades]
- Medium stakes: [Feedback for improvement, prioritization, awards]
- Low stakes: [Self-assessment, informal feedback]
Success criteria for rubric:
- Enables consistent scoring across evaluators (inter-rater reliability ≥70%)
- Provides actionable feedback for improvement
- Takes a reasonable time to use (target: X minutes per evaluation)
- Acceptable to evaluators (not overly complex or rigid)
- Acceptable to evaluatees (perceived as fair and transparent)
Criteria Identification Template
Brainstorming Quality Dimensions
Product criteria (artifact itself):
- Correctness/Accuracy: [Is it right? Factually accurate? Meets requirements?]
- Completeness: [Covers all necessary elements? No major gaps?]
- Clarity: [Easy to understand? Well-organized? Clear communication?]
- Quality/Craftsmanship: [Attention to detail? Polished? Professional?]
- Originality/Creativity: [Novel approach? Innovative? Goes beyond expected?]
- Performance: [Fast? Efficient? Scalable? Meets technical specs?]
Process criteria (how it was made):
- Methodology: [Followed appropriate process? Research methods sound?]
- Collaboration: [Teamwork? Communication? Used feedback?]
- Iteration: [Multiple drafts? Refinement? Responsiveness to critique?]
- Time management: [Completed on time? Paced work appropriately?]
Impact criteria (effects/outcomes):
- Usability: [User-friendly? Accessible? Intuitive?]
- Value: [Solves problem? Addresses need? Business impact?]
- Learning demonstrated: [Shows understanding? Growth from previous work?]
Meta criteria (quality of quality):
- Maintainability: [Can others work with this? Documented? Modular?]
- Testability: [Can be verified? Validated? Measured?]
- Extensibility: [Can be built upon? Flexible? Adaptable?]
Prioritization
Rate each candidate criterion:
| Criterion | Importance (H/M/L) | Observable (Y/N) | Distinct from others (Y/N) | Include? |
|---|---|---|---|---|
| [Criterion 1] | | | | |
| [Criterion 2] | | | | |
| [Criterion 3] | | | | |
Selection rules:
- Must be High or Medium importance
- Must be Observable (can two reviewers score consistently?)
- Must be Distinct (not overlapping with other criteria)
- Aim for 4-8 criteria (balance coverage vs. simplicity)
Final criteria (4-8 selected):
- [Criterion]: [Brief definition]
- [Criterion]: [Brief definition]
- [Criterion]: [Brief definition]
- [Criterion]: [Brief definition]
Scale Selection Template
Scale type options:
Numeric Scales
1-3 scale (Low/Medium/High)
- Use when: Quick categorization, clear tiers sufficient
- Levels: 1=Below standard, 2=Meets standard, 3=Exceeds standard
1-4 scale (Forced choice, no middle)
- Use when: Want to avoid central tendency, need clear differentiation
- Levels: 1=Poor, 2=Fair, 3=Good, 4=Excellent
1-5 scale (Most common, allows neutral)
- Use when: General purpose, familiar to evaluators
- Levels: 1=Poor, 2=Fair, 3=Adequate, 4=Good, 5=Excellent
1-10 scale (Fine gradations)
- Use when: Large sample, need statistical analysis, can distinguish subtle differences
- Levels: 1-2=Poor, 3-4=Fair, 5-6=Adequate, 7-8=Good, 9-10=Excellent
Qualitative Scales
- Developmental: Novice → Developing → Proficient → Expert
- Standards-based: Below Standard → Approaching → Meets → Exceeds
- Competency: Not Yet Competent → Partially Competent → Competent → Highly Competent
Binary
Pass/Fail, Yes/No, Present/Absent
- Use when: Compliance checks, minimum thresholds, clear criteria
Selected scale for this rubric: [Choose one]
- Type: [Numeric 1-5, Qualitative, etc.]
- Levels: [List with labels]
- Rationale: [Why this scale fits purpose]
Descriptor Writing Template
For each criterion, write descriptors at each scale level.
Criterion: [Name]
Definition: [What does this criterion assess? 1-2 sentences]
Why it matters: [Importance to overall quality]
Scale descriptors:
Level 5 (or highest): [Label]
Observable characteristics:
- [Concrete, observable feature 1]
- [Concrete, observable feature 2]
- [Concrete, observable feature 3]
Example: [Specific instance of work at this level]
Level 4: [Label]
Observable characteristics:
- [How this differs from Level 5 - what's missing or less strong]
- [Concrete observable feature]
Example: [Specific instance]
Level 3: [Label] (Baseline/Adequate)
Observable characteristics:
- [Minimum acceptable performance]
- [Observable feature]
Example: [Specific instance]
Level 2: [Label]
Observable characteristics:
- [What's lacking compared to Level 3]
- [Observable deficiency]
Example: [Specific instance]
Level 1 (or lowest): [Label]
Observable characteristics:
- [Significant deficiencies]
- [Observable problems]
Example: [Specific instance]
Descriptor Writing Guidelines
DO:
- Use observable, measurable language ("Contains 3+ bugs" not "poor quality")
- Provide concrete examples or anchors for each level
- Focus on what IS present at each level, not just how it is "less than" the higher level
- Use parallel structure across levels (same aspects addressed at each level)
- Specify quantities when possible ("All 5 requirements met" vs "Most requirements met")
DON'T:
- Use subjective terms without definition ("creative", "professional", "excellent effort")
- Rely on comparative language only ("better than", "more sophisticated")
- Make assumptions about process ("spent time", "worked hard" - unless observable)
- Penalize for things not mentioned in descriptor (hidden expectations)
Analytic Rubric Template
Most common format: Multiple criteria (rows) × Multiple levels (columns)
Rubric for: [Artifact Type]
Purpose: [Brief description]
Scale: [1-5, 1-4, etc. with labels]
| Criterion | 1 | 2 | 3 | 4 | 5 | Weight |
|---|---|---|---|---|---|---|
| [Criterion 1] | [Descriptor] | [Descriptor] | [Descriptor] | [Descriptor] | [Descriptor] | [×N or %] |
| [Criterion 2] | [Descriptor] | [Descriptor] | [Descriptor] | [Descriptor] | [Descriptor] | [×N or %] |
| [Criterion 3] | [Descriptor] | [Descriptor] | [Descriptor] | [Descriptor] | [Descriptor] | [×N or %] |
| [Criterion 4] | [Descriptor] | [Descriptor] | [Descriptor] | [Descriptor] | [Descriptor] | [×N or %] |
Scoring:
- Calculate: ((Score1 × Weight1) + (Score2 × Weight2) + ...) ÷ (sum of all weights), as shown in the sketch below
- Threshold: [e.g., Must average ≥3.0 to pass, ≥4 on critical criteria]
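A minimal sketch of this calculation in Python; the criterion names, weights, scores, and the 3.0 threshold below are illustrative placeholders, not part of the template:

```python
def weighted_average(scores: dict[str, int], weights: dict[str, float]) -> float:
    """Weighted average: sum of (score × weight) divided by the sum of weights."""
    total_weight = sum(weights.values())
    return sum(scores[c] * weights[c] for c in weights) / total_weight

# Illustrative scores (1-5 scale) and weights -- placeholder values
scores = {"correctness": 4, "clarity": 3, "completeness": 5, "maintainability": 3}
weights = {"correctness": 3.0, "clarity": 2.0, "completeness": 2.0, "maintainability": 1.0}

average = weighted_average(scores, weights)
print(f"Weighted average: {average:.2f}")           # 3.88
print(f"Passes >=3.0 threshold: {average >= 3.0}")  # True
```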
Usage notes:
- Score each criterion independently before looking at others (avoid halo effect)
- Provide brief justification for each score
- Flag areas for improvement in feedback
Holistic Rubric Template
Single overall score integrating multiple criteria.
Rubric for: [Artifact Type]
Purpose: [Brief description]
Level 5: Excellent
Overall quality: [Integrated description touching all important aspects]
- Criterion A: [How it manifests at this level]
- Criterion B: [How it manifests at this level]
- Criterion C: [How it manifests at this level]
Example: [Work that exemplifies this level]
Level 4: Good
Overall quality: [Integrated description]
- Differences from Level 5: [What's less strong]
- Key characteristics: [Observable features]
Example: [Work that exemplifies this level]
Level 3: Adequate
Overall quality: [Integrated description of baseline acceptable]
- Meets minimum standards: [What's required]
- May have: [Acceptable weaknesses]
Example: [Work that exemplifies this level]
Level 2: Weak
Overall quality: [Integrated description of below standard]
- Falls short because: [Key deficiencies]
- Problems include: [Observable issues]
Example: [Work that exemplifies this level]
Level 1: Poor
Overall quality: [Integrated description of unacceptable]
- Major problems: [Significant deficiencies across multiple aspects]
Example: [Work that exemplifies this level]
Single-Point Rubric Template
Lists each criterion with a "meets standard" description only, plus space to note where work exceeds or falls below it.
Rubric for: [Artifact Type]
| Criterion | Concerns (Below Standard) | Meets Standard | Advanced (Exceeds Standard) |
|---|---|---|---|
| [Criterion 1] | | [Clear description of standard] | |
| [Criterion 2] | | [Clear description of standard] | |
| [Criterion 3] | | [Clear description of standard] | |
| [Criterion 4] | | [Clear description of standard] | |
Usage:
- Check if work meets standard for each criterion
- Note specific strengths in "Advanced" column (e.g., "+Exceptionally clear examples")
- Note specific areas for improvement in "Concerns" column (e.g., "-Missing citations for 3 claims")
Checklist Template
Binary yes/no for must-have requirements.
Checklist for: [Artifact Type]
Category 1: [e.g., Completeness]
- [ ] [Specific requirement 1]
- [ ] [Specific requirement 2]
- [ ] [Specific requirement 3]
Category 2: [e.g., Quality]
- [ ] [Specific requirement 4]
- [ ] [Specific requirement 5]
Category 3: [e.g., Compliance]
- [ ] [Specific requirement 6]
- [ ] [Specific requirement 7]
Pass/Fail Criteria:
- Pass: all items checked, OR all items in critical categories checked plus ≥X% of the others
- Fail: any critical item unchecked, OR <Y% of total items checked (see the sketch below)
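A minimal sketch of this pass/fail rule, assuming checklist results grouped by category; the category names, items, and the 60% threshold are hypothetical:

```python
def checklist_passes(results: dict[str, dict[str, bool]],
                     critical: set[str], other_threshold: float) -> bool:
    """Pass if every item in a critical category is checked and at least
    other_threshold of the items in the remaining categories are checked."""
    critical_items = [ok for cat in critical for ok in results[cat].values()]
    other_items = [ok for cat, items in results.items()
                   if cat not in critical for ok in items.values()]
    if not all(critical_items):
        return False  # any unchecked critical item fails outright
    if not other_items:
        return True
    return sum(other_items) / len(other_items) >= other_threshold

# Hypothetical checklist results
results = {
    "Compliance": {"license header": True, "no secrets committed": True},
    "Completeness": {"tests included": True, "docs updated": False},
    "Quality": {"lint clean": True},
}
print(checklist_passes(results, critical={"Compliance"}, other_threshold=0.6))  # True (2/3 of others checked)
```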
Weighted Scoring Template
When criteria have different importance.
Weighted Rubric for: [Artifact Type]
| Criterion | Score (1-5) | Weight | Weighted Score |
|---|---|---|---|
| [Criterion 1] | | ×3 (Critical) | Score × 3 = |
| [Criterion 2] | | ×2 (Important) | Score × 2 = |
| [Criterion 3] | | ×2 (Important) | Score × 2 = |
| [Criterion 4] | | ×1 (Desirable) | Score × 1 = |
| Total | | 8 | [Sum] ÷ 8 = |
Weight categories:
- ×3 = Critical (must be strong, threshold: ≥4 required)
- ×2 = Important (significant impact on overall quality)
- ×1 = Desirable (nice to have, less critical)
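A small companion check for the critical-criterion rule above, sketched in Python; the criterion names and scores are placeholders, while the rule that ×3 criteria must score ≥4 comes from the template:

```python
def critical_criteria_pass(scores: dict[str, int], weights: dict[str, int],
                           threshold: int = 4) -> bool:
    """True only if every ×3 (Critical) criterion scores at or above the threshold."""
    return all(scores[c] >= threshold for c, w in weights.items() if w == 3)

# Placeholder criteria: security and functionality are Critical (×3)
scores = {"security": 3, "functionality": 5, "style": 4}
weights = {"security": 3, "functionality": 3, "style": 1}
print(critical_criteria_pass(scores, weights))  # False: security scored 3, below the ≥4 threshold
```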
Calibration Session Template
Pre-calibration:
1. Select 3-5 sample works spanning the quality range (low, medium, high)
2. Have each reviewer independently score the samples using the rubric
3. Record scores without discussion
During calibration:
4. Compare scores across reviewers for each sample
5. For discrepancies (>1 point difference):
   - Discuss what each reviewer saw
   - Identify ambiguous descriptors
   - Clarify criterion boundaries
   - Refine rubric language
6. Re-score the samples using the refined rubric
Post-calibration:
7. Calculate inter-rater reliability (% agreement, Cohen's kappa), as in the sketch below
8. Target: ≥70% agreement (within 1 point) or kappa ≥0.6
9. If below target: iterate with further refinement and another calibration round
10. Document calibration decisions and rubric changes
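A minimal sketch of both reliability measures for two reviewers, assuming scores on the same 1-5 scale; the sample scores are illustrative:

```python
from collections import Counter

def percent_agreement(r1: list[int], r2: list[int], tolerance: int = 1) -> float:
    """Share of samples on which the two raters differ by at most `tolerance`."""
    return sum(abs(a - b) <= tolerance for a, b in zip(r1, r2)) / len(r1)

def cohens_kappa(r1: list[int], r2: list[int]) -> float:
    """Cohen's kappa for two raters (exact agreement, corrected for chance)."""
    n = len(r1)
    p_observed = sum(a == b for a, b in zip(r1, r2)) / n
    c1, c2 = Counter(r1), Counter(r2)
    p_expected = sum(c1[k] * c2[k] for k in c1) / (n * n)
    return (p_observed - p_expected) / (1 - p_expected)  # assumes p_expected < 1

# Illustrative scores from two reviewers on five calibration samples
rater_a = [2, 3, 4, 5, 3]
rater_b = [2, 4, 4, 5, 2]
print(percent_agreement(rater_a, rater_b))  # 1.0 -- every pair within 1 point
print(cohens_kappa(rater_a, rater_b))       # ≈0.5 -- below the 0.6 target
```

Percent agreement within 1 point often looks high even when exact agreement is weak, which is why the kappa target is worth checking alongside it.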
Feedback Template
For: [Evaluatee Name]
Overall Score: [X.X / 5.0 or Level]
Criterion-by-Criterion Scores:
| Criterion | Score | Feedback |
|---|---|---|
| [Criterion 1] | X/5 | Strengths: [What was done well] Areas for improvement: [Specific suggestions] |
| [Criterion 2] | X/5 | Strengths: [What was done well] Areas for improvement: [Specific suggestions] |
| [Criterion 3] | X/5 | Strengths: [What was done well] Areas for improvement: [Specific suggestions] |
Summary:
- Greatest strengths: [2-3 specific strengths]
- Priority improvements: [2-3 most important areas to address]
- Next steps: [Actionable recommendations]
Overall assessment: [Pass/Fail or qualitative judgment]