# ScholarEval Evaluation Framework

## Overview

This document provides detailed evaluation criteria, rubrics, and quality indicators for each dimension of the ScholarEval framework. Use these standards when conducting systematic evaluations of scholarly work.

---

## Dimension 1: Problem Formulation & Research Questions

### Quality Indicators

**Excellent (5):**
- Research question is specific, measurable, and clearly articulated
- Problem addresses significant gap in literature with high impact potential
- Scope is appropriate and feasible within constraints
- Novel contribution is clearly differentiated from existing work
- Theoretical or practical significance is compellingly justified

**Good (4):**
- Research question is clear with minor ambiguities
- Problem is relevant with moderate impact potential
- Scope is generally appropriate with minor feasibility concerns
- Contribution is identifiable though not groundbreaking
- Significance is adequately justified

**Adequate (3):**
- Research question is present but lacks specificity
- Problem relevance is unclear or incremental
- Scope may be too broad or narrow
- Contribution is unclear or overlaps heavily with existing work
- Significance justification is weak

**Needs Improvement (2):**
- Research question is vague or poorly defined
- Problem lacks clear relevance or significance
- Scope is inappropriate or infeasible
- Contribution is not articulated
- No clear justification for significance

**Poor (1):**
- No clear research question
- Problem is trivial or irrelevant
- Scope is fundamentally flawed
- No identifiable contribution
- No significance justification

### Assessment Checklist

- [ ] Is the research question clearly stated?
- [ ] Can the question be answered with the proposed approach?
- [ ] Is the problem significant to the field?
- [ ] Is the scope feasible within resource constraints?
- [ ] Is the novelty/contribution clearly articulated?
- [ ] Are key assumptions explicitly stated?
- [ ] Are success criteria or expected outcomes defined?

---

## Dimension 2: Literature Review

### Quality Indicators

**Excellent (5):**
- Comprehensive coverage of relevant literature across key areas
- Critical synthesis identifying patterns, contradictions, and gaps
- Literature is current (majority from the last 3-5 years for rapidly evolving fields)
- Sources are authoritative and peer-reviewed
- Clear positioning of current work within scholarly conversation
- Identifies genuine research gaps that the work addresses

**Good (4):**
- Good coverage with minor gaps in key areas
- Mostly synthesis with some description
- Literature is mostly current with some older foundational works
- Sources are generally authoritative
- Work positioning is present but could be stronger
- Research gaps are identified but may not be critical

**Adequate (3):**
- Partial coverage with notable gaps
- More descriptive summarization than synthesis
- Literature is a mix of current and dated sources
- Mix of authoritative and less rigorous sources
- Weak positioning within existing literature
- Research gaps are vague or questionable

**Needs Improvement (2):**
- Minimal coverage with major gaps
- Purely descriptive without synthesis
- Literature is largely outdated
- Sources lack authority or rigor
- Little to no positioning of current work
- No clear research gaps identified

**Poor (1):**
- Inadequate or absent literature review
- No synthesis
- Outdated or inappropriate sources
- No engagement with scholarly conversation
- No gap identification

### Assessment Checklist

- [ ] Does the review cover all major relevant areas?
- [ ] Is literature synthesized rather than just summarized?
- [ ] Are sources current and authoritative?
- [ ] Are contrasting viewpoints presented?
- [ ] Are research gaps clearly identified?
- [ ] Is the current work positioned within existing literature?
- [ ] Is citation balance appropriate (not over-relying on few authors)?
- [ ] Are seminal/foundational works included?

### Common Issues

- **Insufficient coverage**: Missing key papers or research streams
- **Descriptive listing**: Summarizing papers sequentially without synthesis
- **Outdated sources**: Relying mainly on literature more than 5-10 years old when newer work exists
- **Cherry-picking**: Only citing work that supports hypothesis
- **Poor organization**: Lack of thematic or conceptual structure
- **Weak gap identification**: Gaps are trivial or not actually gaps

---

## Dimension 3: Methodology & Research Design

### Quality Indicators

**Excellent (5):**
- Research design perfectly aligned with research questions
- Methods are rigorous, valid, and reliable
- Procedures are detailed enough for replication
- Controls, randomization, or triangulation are used appropriately
- Potential biases acknowledged and mitigated
- Ethical considerations addressed comprehensively
- Limitations are explicitly discussed

**Good (4):**
- Design is appropriate with minor alignment issues
- Methods are sound with small validity concerns
- Procedures are mostly replicable
- Some controls or validation present
- Major biases addressed
- Ethical considerations mentioned
- Some limitations discussed

**Adequate (3):**
- Design partially appropriate for questions
- Methods have notable validity concerns
- Procedures lack detail for full replication
- Limited controls or validation
- Bias mitigation is minimal
- Ethics addressed superficially
- Limitations minimally discussed

**Needs Improvement (2):**
- Design poorly aligned with research questions
- Methods have serious validity issues
- Procedures too vague to replicate
- No controls or validation
- Biases not addressed
- Ethical concerns not addressed
- No limitation discussion

**Poor (1):**
- Inappropriate or absent methodology
- Methods fundamentally flawed
- Not replicable
- No validity considerations
- No ethical considerations
- No acknowledgment of limitations

### Assessment Checklist

- [ ] Is methodology appropriate for research questions?
- [ ] Are procedures described in sufficient detail?
- [ ] Can the study be replicated from the description?
- [ ] Are validity and reliability addressed?
- [ ] Are potential biases identified and mitigated?
- [ ] Are ethical considerations discussed?
- [ ] Are limitations acknowledged?
- [ ] Is sample size justified (for quantitative work)?
- [ ] Are qualitative methods rigorous (if applicable)?

### Design-Specific Considerations

**Quantitative Studies:**
- Sample size with power analysis
- Control groups and randomization
- Measurement validity and reliability
- Statistical assumptions checking

**Qualitative Studies:**
- Sampling strategy and saturation
- Data collection procedures
- Coding and analysis framework
- Trustworthiness criteria (credibility, transferability, etc.)

**Mixed Methods:**
- Integration rationale
- Sequencing justification
- Data convergence strategy
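
To make the "sample size with power analysis" item under Quantitative Studies concrete, here is a minimal sketch of a per-group sample size estimate for a two-sided, two-sample comparison of means. It uses the normal approximation n = 2 * ((z_(1-alpha/2) + z_power) / d)^2. The function name `required_n_per_group` and the defaults (alpha = 0.05, power = 0.80) are assumptions chosen for illustration and are not prescribed by the framework; an exact t-based calculation typically yields a slightly larger n.

```python
import math
from statistics import NormalDist


def required_n_per_group(effect_size: float, alpha: float = 0.05,
                         power: float = 0.80) -> int:
    """Approximate per-group n for a two-sided, two-sample comparison of means.

    effect_size is the standardized mean difference (Cohen's d).
    Normal approximation only; illustrative, not a replacement for
    dedicated power-analysis software.
    """
    if effect_size <= 0:
        raise ValueError("effect_size must be positive")
    if not (0 < alpha < 1 and 0 < power < 1):
        raise ValueError("alpha and power must lie strictly between 0 and 1")
    std_normal = NormalDist()
    z_alpha = std_normal.inv_cdf(1 - alpha / 2)  # critical value, two-sided test
    z_power = std_normal.inv_cdf(power)          # quantile for the target power
    return math.ceil(2 * ((z_alpha + z_power) / effect_size) ** 2)


print(required_n_per_group(0.5))  # medium effect: 63 per group
```

An evaluator can compare such an estimate with the reported sample size; a large shortfall suggests the study may be underpowered for the effect it claims to detect.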

---

## Dimension 4: Data Collection & Sources

### Quality Indicators

**Excellent (5):**
- Data sources are highly credible and appropriate
- Sample size is sufficient and well-justified
- Data collection procedures are rigorous and systematic
- Data quality controls are in place
- Sampling strategy ensures representativeness
- Missing data is minimal and handled appropriately

**Good (4):**
- Data sources are credible with minor concerns
- Sample size is adequate
- Collection procedures are systematic
- Some quality controls present
- Sampling is reasonable
- Missing data is addressed

**Adequate (3):**
- Data sources are acceptable but not optimal
- Sample size is marginal
- Collection procedures lack some rigor
- Limited quality controls
- Sampling may have bias concerns
- Missing data handling is basic

**Needs Improvement (2):**
- Data sources have credibility issues
- Sample size is insufficient
- Collection procedures are ad hoc
- No quality controls
- Sampling is clearly biased
- Missing data not addressed

**Poor (1):**
- Data sources are inappropriate or unreliable
- Sample size is inadequate
- Collection is unsystematic
- No quality considerations
- Sampling is fundamentally flawed
- Excessive missing data

### Assessment Checklist

- [ ] Are data sources credible and appropriate?
- [ ] Is sample size sufficient for conclusions?
- [ ] Is sampling strategy clearly described?
- [ ] Is the sample representative of target population?
- [ ] Are data collection procedures systematic?
- [ ] Are data quality controls described?
- [ ] Is missing data addressed?
- [ ] Are any potential data biases discussed?

---

## Dimension 5: Analysis & Interpretation

### Quality Indicators

**Excellent (5):**
- Analytical methods perfectly suited to data and questions
- Analysis is rigorous with appropriate techniques
- Results interpretation is logical and well-supported
- Alternative explanations are considered
- Claims are proportionate to evidence
- Assumptions are validated
- Analysis is transparent and reproducible

**Good (4):**
- Methods are appropriate with minor issues
- Analysis is sound
- Interpretation is mostly logical
- Some alternatives considered
- Claims generally match evidence
- Key assumptions checked
- Analysis is mostly transparent

**Adequate (3):**
- Methods are acceptable but not optimal
- Analysis has some technical issues
- Interpretation has logical gaps
- Alternatives not thoroughly explored
- Some claims exceed evidence
- Assumptions not fully validated
- Analysis transparency is limited

**Needs Improvement (2):**
- Methods are questionable for data/questions
- Analysis has significant technical flaws
- Interpretation is poorly supported
- No alternative explanations
- Claims significantly exceed evidence
- Assumptions not checked
- Analysis is not transparent

**Poor (1):**
- Methods are inappropriate
- Analysis is fundamentally flawed
- Interpretation is illogical
- No consideration of alternatives
- Claims unsupported by evidence
- No assumption validation
- Analysis is opaque

### Assessment Checklist

- [ ] Are analytical methods appropriate?
- [ ] Are statistical tests/qualitative methods properly applied?
- [ ] Are assumptions tested?
- [ ] Is interpretation logical and well-supported?
- [ ] Are alternative explanations considered?
- [ ] Do claims align with evidence strength?
- [ ] Is analysis reproducible from description?
- [ ] Are uncertainties acknowledged?

### Quantitative Analysis

- Appropriate statistical tests
- Assumptions checked (normality, homogeneity, etc.)
- Effect sizes reported
- Confidence intervals provided
- Multiple testing corrections (if applicable)
- Model diagnostics performed
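
As one illustration of the "multiple testing corrections" item above, the sketch below applies the Holm-Bonferroni step-down procedure to a set of p-values. The framework does not prescribe a particular correction; Holm-Bonferroni is swapped in here as a common example, and the function name `holm_bonferroni` and the sample p-values are hypothetical.

```python
def holm_bonferroni(p_values: list[float], alpha: float = 0.05) -> list[bool]:
    """Return, for each input p-value, whether its null hypothesis is rejected.

    Holm-Bonferroni step-down procedure: p-values are tested in ascending
    order against alpha / (m - rank), and testing stops at the first failure.
    """
    m = len(p_values)
    # Indices of the p-values, smallest p-value first.
    order = sorted(range(m), key=lambda i: p_values[i])
    reject = [False] * m
    for rank, idx in enumerate(order):
        if p_values[idx] <= alpha / (m - rank):
            reject[idx] = True
        else:
            break  # every remaining (larger) p-value is also not rejected
    return reject


# Four hypothetical tests: only the two smallest p-values survive correction.
print(holm_bonferroni([0.01, 0.04, 0.03, 0.005]))  # [True, False, False, True]
```

When a paper reports many comparisons at an uncorrected 0.05 level, rerunning the reported p-values through a correction like this shows how many findings would remain significant.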

### Qualitative Analysis

- Coding framework is clear
- Inter-rater reliability (if applicable)
- Saturation discussed
- Negative cases examined
- Member checking or validation
- Clear audit trail
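
For the "inter-rater reliability" item above, a minimal sketch of Cohen's kappa for two coders follows. Cohen's kappa is one common agreement statistic; the framework does not name a specific measure, so this choice, the function name `cohens_kappa`, and the sample codes are assumptions for illustration.

```python
from collections import Counter


def cohens_kappa(rater_a: list[str], rater_b: list[str]) -> float:
    """Cohen's kappa for two raters assigning one category per item."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must code the same non-empty set of items")
    n = len(rater_a)
    # Observed agreement: share of items coded identically.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement from each rater's marginal category frequencies.
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used one identical category throughout
    return (observed - expected) / (1 - expected)


coder_1 = ["theme_x", "theme_x", "theme_y", "theme_y"]
coder_2 = ["theme_x", "theme_y", "theme_y", "theme_y"]
print(cohens_kappa(coder_1, coder_2))  # 0.5
```

Kappa corrects raw agreement for agreement expected by chance, which is why 75% raw agreement in the example yields only 0.5.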

---

## Dimension 6: Results & Findings

### Quality Indicators

**Excellent (5):**
- Results are clearly and comprehensively presented
- Visualizations are effective and appropriate
- Statistical or qualitative rigor is evident
- Key findings are highlighted effectively
- Results directly address research questions
- Patterns and relationships are clearly shown
- Negative and null results are reported

**Good (4):**
- Results are clear with minor presentation issues
- Visualizations are generally effective
- Rigor is present
- Main findings are identifiable
- Results mostly address questions
- Patterns are shown
- Some negative results included

**Adequate (3):**
- Results presentation is adequate but could be clearer
- Visualizations are basic or have issues
- Rigor is questionable in places
- Findings are present but not emphasized
- Partial alignment with questions
- Patterns are unclear
- Negative results may be omitted

**Needs Improvement (2):**
- Results presentation is unclear or confusing
- Visualizations are poor or misleading
- Lack of rigor
- Findings are difficult to identify
- Weak alignment with questions
- No clear patterns
- Only positive results shown

**Poor (1):**
- Results are poorly presented or absent
- Visualizations are inappropriate or missing
- No evidence of rigor
- Findings are unclear
- Results don't address questions
- No identifiable patterns
- Results appear selective

### Assessment Checklist

- [ ] Are results clearly presented?
- [ ] Do results directly address research questions?
- [ ] Are visualizations appropriate and effective?
- [ ] Are key findings highlighted?
- [ ] Are negative/null results reported?
- [ ] Is appropriate precision reported (p-values, CIs, effect sizes)?
- [ ] Are qualitative findings supported by data excerpts?
- [ ] Is the reporting free of signs of selective reporting?

### Presentation Quality

**Tables:**
- Clear labels and captions
- Appropriate precision
- Organized logically
- Not overly complex

**Figures:**
- Clear axes and legends
- Appropriate chart type
- Professional appearance
- Accessible (color-blind friendly)

**Text:**
- Highlights key findings
- Avoids redundancy with tables/figures
- Uses appropriate statistical language

---

## Dimension 7: Scholarly Writing & Presentation

### Quality Indicators

**Excellent (5):**
- Writing is clear, concise, and precise
- Organization is logical with excellent flow
- Academic tone is appropriate and consistent
- Grammar and mechanics are flawless
- Technical terms are used correctly
- Accessible to target audience
- Abstract/summary is comprehensive and accurate

**Good (4):**
- Writing is clear with minor awkwardness
- Organization is logical with good flow
- Tone is mostly appropriate
- Few grammar/mechanical errors
- Technical terms mostly correct
- Generally accessible
- Abstract is adequate

**Adequate (3):**
- Writing is understandable but has clarity issues
- Organization has some logical gaps
- Tone inconsistencies
- Noticeable grammar/mechanical errors
- Some technical term misuse
- Accessibility issues for target audience
- Abstract is incomplete or vague

**Needs Improvement (2):**
- Writing is often unclear or verbose
- Poor organization and flow
- Tone is inappropriate
- Frequent grammar/mechanical errors
- Technical terminology problems
- Not accessible to target audience
- Abstract is poor or missing

**Poor (1):**
- Writing is unclear and difficult to follow
- No clear organization
- Tone is inappropriate
- Pervasive grammar/mechanical errors
- Incorrect technical terminology
- Inaccessible
- No adequate abstract

### Assessment Checklist

- [ ] Is writing clear and concise?
- [ ] Is organization logical?
- [ ] Is tone appropriate for academic writing?
- [ ] Are grammar and mechanics correct?
- [ ] Are technical terms used appropriately?
- [ ] Is jargon explained when necessary?
- [ ] Does abstract accurately summarize the work?
- [ ] Are transitions between sections smooth?
- [ ] Is the target audience clear?

### Common Writing Issues

- **Wordiness**: Unnecessarily complex or lengthy prose
- **Passive voice overuse**: Reduces clarity and directness
- **Paragraph structure**: Lack of topic sentences or coherence
- **Redundancy**: Repeating information unnecessarily
- **Logical flow**: Poor transitions between ideas
- **Precision**: Vague or ambiguous language
- **Accessibility**: Too technical or not technical enough

---

## Dimension 8: Citations & References

### Quality Indicators

**Excellent (5):**
- All claims are appropriately cited
- Sources are authoritative and current
- Citations are accurate and complete
- Diverse perspectives are represented
- Citation format is consistent and correct
- Self-citation is balanced with citation of others' work
- Primary sources used appropriately

**Good (4):**
- Most claims are cited
- Sources are generally authoritative
- Few citation errors
- Reasonable diversity of sources
- Format is mostly consistent
- Citation balance is good
- Mix of primary and secondary sources

**Adequate (3):**
- Some claims lack citations
- Source quality is mixed
- Several citation errors
- Limited source diversity
- Format inconsistencies
- Citation balance issues
- Over-reliance on secondary sources

**Needs Improvement (2):**
- Many claims uncited
- Sources are questionable
- Numerous citation errors
- Narrow source base
- Format is inconsistent
- Excessive self-citation or narrow citing
- Inappropriate sources (e.g., only secondary)

**Poor (1):**
- Inadequate citations
- Unreliable sources
- Pervasive citation errors
- Minimal source diversity
- No consistent format
- Severe citation imbalance
- Inappropriate source types

### Assessment Checklist

- [ ] Are all factual claims cited?
- [ ] Are primary sources cited where appropriate?
- [ ] Are sources authoritative and peer-reviewed?
- [ ] Is there balance in perspectives cited?
- [ ] Are citations accurate (authors, dates, pages)?
- [ ] Is citation format consistent?
- [ ] Are self-citations appropriate (typically <20%)?
- [ ] Are sources current (for time-sensitive topics)?
- [ ] Are classic/seminal works included where relevant?

### Citation Quality Assessment

**Source Types (in order of preference for most academic work):**
1. Peer-reviewed journal articles
2. Academic books from reputable publishers
3. Conference proceedings (field-dependent)
4. Technical reports from reputable institutions
5. Dissertations/theses
6. Preprints (with caution, field-dependent)
7. Grey literature (limited use)
8. Websites (rarely appropriate, except for factual data)

**Red Flags:**
- Wikipedia as a primary source
- Excessive self-citation (>30%)
- Only citing papers that support hypothesis
- Outdated sources when current ones exist
- Missing key papers in the field
- Citing abstracts only when full papers are available
- Inconsistent or incorrect citation format
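
The self-citation figures in this dimension (typically under 20% in the checklist, over 30% as a red flag) can be checked with a short calculation. The sketch below is illustrative only: it assumes a reference counts as a self-citation when it shares at least one author with the manuscript, matches names by simple case-insensitive comparison, and labels the 20-30% range "borderline". The function names, the matching rule, and the "borderline" label are assumptions, not part of the framework.

```python
def self_citation_share(reference_authors: list[set[str]],
                        manuscript_authors: set[str]) -> float:
    """Fraction of references sharing at least one author with the manuscript."""
    if not reference_authors:
        raise ValueError("reference list is empty")
    # Naive name normalization; real author matching needs disambiguation.
    own = {name.strip().casefold() for name in manuscript_authors}
    self_cited = sum(
        1 for authors in reference_authors
        if own & {name.strip().casefold() for name in authors}
    )
    return self_cited / len(reference_authors)


def flag_self_citation(share: float) -> str:
    """Map a self-citation share onto the thresholds quoted in this dimension."""
    if share > 0.30:
        return "red flag"    # red-flag threshold (>30%)
    if share < 0.20:
        return "typical"     # checklist guidance (typically <20%)
    return "borderline"      # assumed label for the range in between


refs = [{"A. Lee", "B. Cho"}, {"C. Diaz"}, {"A. Lee"}, {"D. Evans"}, {"E. Fox"}]
share = self_citation_share(refs, {"a. lee"})
print(flag_self_citation(share))  # 2 of 5 references, so "red flag"
```

Appropriate self-citation rates vary by field and by how specialized the topic is, so a flag from a calculation like this is a prompt for closer reading, not a verdict.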

---

## Cross-Cutting Considerations

### Reproducibility

Assess across dimensions:
- Are methods detailed enough to replicate?
- Are data and code available (or availability explained)?
- Are analysis steps transparent?
- Are materials/instruments specified?

### Ethics

Consider:
- IRB approval (for human subjects)
- Informed consent
- Privacy and confidentiality
- Conflicts of interest
- Research integrity
- Data sharing ethics

### Bias and Limitations

Evaluate whether:
- Potential biases are acknowledged
- Limitations are discussed honestly
- Boundary conditions are specified
- Generalizability is appropriately claimed

### Impact and Significance

Consider:
- Theoretical contribution
- Practical implications
- Policy relevance
- Methodological innovation
- Field advancement

---

## Scoring Guidelines

### Dimension Weighting (Suggested, Adjust by Context)

- Problem Formulation: 15%
- Literature Review: 15%
- Methodology: 20%
- Data Collection: 10%
- Analysis: 15%
- Results: 10%
- Writing: 10%
- Citations: 5%
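
The suggested weights above combine the eight dimension scores into a single overall score as a weighted average. The sketch below is one way to compute it; the names `SUGGESTED_WEIGHTS` and `weighted_overall_score` and the example scores are hypothetical. Dividing by the total weight is an added assumption so that context-adjusted weights that do not sum to exactly 100% still produce a score on the 1-5 scale.

```python
# Suggested weights from the list above, expressed as fractions.
SUGGESTED_WEIGHTS = {
    "Problem Formulation": 0.15,
    "Literature Review": 0.15,
    "Methodology": 0.20,
    "Data Collection": 0.10,
    "Analysis": 0.15,
    "Results": 0.10,
    "Writing": 0.10,
    "Citations": 0.05,
}


def weighted_overall_score(scores: dict[str, float],
                           weights: dict[str, float] = SUGGESTED_WEIGHTS) -> float:
    """Weighted average of per-dimension scores on the 1-5 scale."""
    if set(scores) != set(weights):
        raise ValueError("scores and weights must cover the same dimensions")
    if any(not 1 <= s <= 5 for s in scores.values()):
        raise ValueError("each dimension score must be between 1 and 5")
    total_weight = sum(weights.values())
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    # Normalize by the total weight so adjusted weights remain valid.
    return sum(scores[d] * weights[d] for d in weights) / total_weight


example_scores = {
    "Problem Formulation": 4, "Literature Review": 3, "Methodology": 4,
    "Data Collection": 4, "Analysis": 4, "Results": 3,
    "Writing": 4, "Citations": 5,
}
print(round(weighted_overall_score(example_scores), 2))  # 3.8
```

Because Methodology carries the largest weight, a one-point change there moves the overall score by 0.20, four times the effect of a one-point change in Citations.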

### Overall Assessment Thresholds

- **Exceptional (4.5-5.0)**: Ready for top-tier publication
- **Strong (4.0-4.4)**: Publication-ready with minor revisions
- **Good (3.5-3.9)**: Major revisions required, promising work
- **Acceptable (3.0-3.4)**: Significant revisions needed
- **Weak (2.0-2.9)**: Fundamental issues, major rework required
- **Poor (<2.0)**: Not suitable for publication without complete revision
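
The thresholds above can be applied mechanically once an overall score is available. The bands are stated to one decimal place, which leaves small gaps (for example, a score of 4.45); the sketch below assumes each band starts at its stated lower bound and runs up to the next band. That rule and the function name `overall_category` are assumptions for illustration.

```python
# (lower bound, label) pairs taken from the thresholds above, highest first.
BANDS = [
    (4.5, "Exceptional"),
    (4.0, "Strong"),
    (3.5, "Good"),
    (3.0, "Acceptable"),
    (2.0, "Weak"),
]


def overall_category(score: float) -> str:
    """Return the assessment band for an overall score on the 1-5 scale."""
    if not 1.0 <= score <= 5.0:
        raise ValueError("overall score must be between 1.0 and 5.0")
    for lower_bound, label in BANDS:
        if score >= lower_bound:
            return label
    return "Poor"  # anything below 2.0


for value in (4.6, 4.2, 3.8, 3.0, 2.5, 1.9):
    print(value, overall_category(value))
```

Scores that land close to a boundary deserve a second look at the underlying dimension scores before a band is reported.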

### Contextual Adjustments

Adjust standards based on the following factors (where an ordering is given, expectations rise from left to right):
- **Stage**: Proposal < Draft < Final submission
- **Venue**: Student thesis < Conference < Journal < Top-tier journal
- **Type**: Theoretical < Empirical < Meta-analysis
- **Field**: Standards vary by discipline
- **Purpose**: Educational < Professional < Publication

---

## Using This Framework

1. **Read the work thoroughly** before beginning evaluation
2. **Score each dimension** using the 5-point scale
3. **Document evidence** for each score with specific examples
4. **Consider context** and adjust expectations appropriately
5. **Synthesize findings** across dimensions
6. **Provide actionable feedback** prioritized by impact
7. **Balance criticism with recognition** of strengths

This framework is a guide, not a rigid checklist. Professional judgment should always be applied in context.