Initial commit

Zhongwei Li
2025-11-29 17:58:28 +08:00
commit 4086c83ff0
25 changed files with 2875 additions and 0 deletions

skills/ai-check/SKILL.md
---
name: ai-check
description: "Detect AI/LLM-generated text patterns in research writing. Use when: (1) Reviewing manuscript drafts before submission, (2) Pre-commit validation of documentation, (3) Quality assurance checks on research artifacts, (4) Ensuring natural academic writing style, (5) Tracking writing authenticity over time. Analyzes grammar perfection, sentence uniformity, paragraph structure, word frequency (AI-typical words like 'delve', 'leverage', 'robust'), punctuation patterns, and transition word overuse."
allowed-tools: Read, Grep, Bash
version: 1.0.0
---
# AI-Generated Text Detection Skill
## Purpose
Detect patterns typical of LLM-generated text to ensure natural, human-authored academic writing. This skill helps maintain authenticity in research publications, dissertations, and documentation.
## When to Use This Skill
### Primary Use Cases
1. **Pre-Commit Validation** - Automatically check manuscripts and documentation before git commits
2. **Manuscript Review** - Validate academic writing before submission to journals or committees
3. **Quality Assurance** - Part of systematic QA workflow for research artifacts
4. **On-Demand Analysis** - Manual review of any text file for AI patterns
5. **Writing Evolution Tracking** - Monitor writing style changes over time
### Specific Scenarios
- Before submitting dissertation chapters to advisor
- Prior to journal article submission
- When reviewing team member contributions
- After making significant edits to documentation
- When suspicious patterns are noticed in writing
- During peer review or committee review preparation
## Detection Methodology
### 1. Grammar Pattern Analysis
**What We Check:**
- **Excessive Perfection**: Zero typos, missing commas, or minor errors throughout
- **Comma Placement**: Perfect comma usage in all complex sentences
- **Formal Register**: Consistent formal tone with no informal elements
- **Grammar Consistency**: No variations in grammatical choices
**Red Flags:**
- Absolutely no grammatical errors in 10+ pages
- Every semicolon and colon used perfectly
- No sentence fragments or run-ons even in appropriate contexts
- Overly formal language even in methods sections
**Human Writing Typically Has:**
- Occasional minor typos or comma splices
- Some inconsistency in formality
- Natural variations in grammar choices
- Context-appropriate informality
### 2. Sentence Structure Uniformity
**What We Check:**
- **Sentence Length Distribution**: Variation in sentence lengths
- **Structural Patterns**: Repetitive sentence structures
- **Complexity Variation**: Mix of simple, compound, complex sentences
- **Opening Patterns**: How sentences begin
**Red Flags:**
- Most sentences 15-25 words (AI sweet spot)
- Repetitive subject-verb-object patterns
- Every paragraph starts with topic sentence
- Excessive use of transition words at sentence starts
- Predictable sentence complexity patterns
**Human Writing Typically Has:**
- Wide variation (5-40+ word sentences)
- Unpredictable sentence structures
- Occasional fragments for emphasis
- Natural flow without forced transitions
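The length-variation check above can be sketched numerically. This is an assumed metric, not the skill's actual implementation: the coefficient of variation of sentence lengths, where low values indicate AI-like uniformity (the 0.3 cutoff is an illustrative threshold):

```python
import re
import statistics

def sentence_length_cv(text: str) -> float:
    """Coefficient of variation of sentence lengths, in words.

    Low values suggest the uniform 15-25 word pattern typical of
    LLM output; the 0.3 cutoff used below is an assumed threshold.
    """
    # Naive sentence split on terminal punctuation.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    lengths = [len(s.split()) for s in sentences]
    if len(lengths) < 2:
        return 0.0
    return statistics.stdev(lengths) / statistics.mean(lengths)

uniform = "The model works well. The data looks clean now. The test runs fast too."
varied = ("It failed. After three weeks of debugging and two rewrites, we "
          "finally traced the crash to a race condition. Simple fix.")
print(sentence_length_cv(uniform) < 0.3 < sentence_length_cv(varied))  # → True
```

A real analyzer would need smarter sentence splitting (abbreviations, decimal points), but the shape of the check is the same.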
### 3. Paragraph Structure Analysis
**What We Check:**
- **Paragraph Length**: Uniformity vs. natural variation
- **Structural Pattern**: Topic sentence + support + conclusion pattern
- **Information Flow**: Natural vs. algorithmic organization
- **Paragraph Transitions**: Connection between paragraphs
**Red Flags:**
- All paragraphs 4-6 sentences long
- Every paragraph follows same structure
- Mechanical transitions between paragraphs
- Perfectly balanced paragraph lengths
- No single-sentence paragraphs
**Human Writing Typically Has:**
- Paragraph length variation (1-10+ sentences)
- Structural diversity based on content
- Natural transitions
- Strategic use of short/long paragraphs for emphasis
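The "all paragraphs 4-6 sentences" red flag above lends itself to a quick numeric check. A minimal sketch, using assumed heuristics rather than the skill's actual implementation:

```python
def paragraph_sentence_counts(text: str) -> list[int]:
    """Approximate sentences per paragraph (paragraphs = blank-line blocks)."""
    paras = [p for p in text.split("\n\n") if p.strip()]
    return [sum(p.count(ch) for ch in ".!?") or 1 for p in paras]

def uniform_paragraphs(counts: list[int]) -> bool:
    """Red flag from the list above: every paragraph is 4-6 sentences long."""
    return bool(counts) and all(4 <= c <= 6 for c in counts)

# Four identical five-sentence paragraphs: the AI-typical pattern.
ai_like = "\n\n".join(["First. Second. Third. Fourth. Fifth."] * 4)
print(uniform_paragraphs(paragraph_sentence_counts(ai_like)))  # → True
```

Counting terminal punctuation is crude, but it is enough to surface the mechanical uniformity described above.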
### 4. Word Frequency Analysis (AI-Typical Words)
**High-Risk AI Words** (overused by LLMs):
**Verbs:**
- "delve" (rarely used by humans)
- "leverage" (business jargon)
- "utilize" (instead of "use")
- "facilitate" (overly formal)
- "demonstrate" (overused)
- "implement" (in non-technical contexts)
- "enhance" (marketing language)
**Adjectives:**
- "robust" (technical overuse)
- "comprehensive" (vague intensifier)
- "innovative" (buzzword)
- "cutting-edge" (cliché)
- "significant" (statistical overuse)
- "substantial" (formal overuse)
- "considerable" (formal overuse)
- "crucial" (intensity overuse)
**Transition Words** (overused):
- "furthermore" (very formal)
- "moreover" (archaic feeling)
- "additionally" (redundant)
- "consequently" (overused)
- "subsequently" (temporal overuse)
- "nevertheless" (formal overuse)
- "nonetheless" (synonym overuse)
**Phrases:**
- "it is important to note that"
- "it should be emphasized that"
- "a comprehensive analysis of"
- "in the context of"
- "with respect to"
- "in terms of"
- "in order to" (instead of "to")
**Detection Criteria:**
- Count frequency per 1000 words
- Compare to human academic writing baselines
- Flag if 3+ high-risk words per 1000 words
- Weight by word rarity (delve = high weight)
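The per-1000-words count described above can be sketched in a few lines. The word list here is abbreviated from the categories listed earlier; the skill's real analyzer is not shown in this document:

```python
import re

# Abbreviated high-risk list (including an inflected form).
HIGH_RISK = {"delve", "delves", "leverage", "utilize"}

def ai_words_per_1000(text: str, wordlist: set[str] = HIGH_RISK) -> float:
    """Frequency of flagged words per 1000 words of text."""
    words = re.findall(r"[a-z']+", text.lower())
    if not words:
        return 0.0
    hits = sum(1 for w in words if w in wordlist)
    return 1000 * hits / len(words)

sample = "We delve into the data and leverage robust methods to utilize every signal."
print(ai_words_per_1000(sample))  # 3 flagged words in a 13-word sample
```

Weighting by rarity would simply multiply each hit by a per-word weight (e.g. "delve" higher than "utilize") before normalizing.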
### 5. Punctuation Patterns
**What We Check:**
- **Semicolon Usage**: Frequency and correctness
- **Colon Usage**: Perfect usage patterns
- **Em-Dash Usage**: Consistent stylistic choices
- **Comma Patterns**: Perfection vs. natural variation
- **Ellipsis/Exclamation**: Absence in informal contexts
**Red Flags:**
- Excessive semicolon use (2+ per paragraph)
- Perfect colon usage throughout
- Consistent em-dash formatting (—)
- No missing commas anywhere
- Zero informal punctuation
**Human Writing Typically Has:**
- Inconsistent punctuation choices
- Occasional missing/extra commas
- Variable dash formatting (- vs -- vs —)
- Some informal punctuation where appropriate
## Confidence Scoring System
### Scoring Formula
**Overall Confidence** = Weighted average of:
- Grammar perfection: 20%
- Sentence uniformity: 25%
- Paragraph structure: 20%
- AI-typical words: 25%
- Punctuation patterns: 10%
Each metric scored 0-100, then combined with weights.
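The weighted combination can be written directly from the weights above. A sketch with an illustrative set of metric scores (the report examples in this document round their totals, so exact figures will differ):

```python
WEIGHTS = {
    "grammar_perfection": 0.20,
    "sentence_uniformity": 0.25,
    "paragraph_structure": 0.20,
    "ai_word_frequency": 0.25,
    "punctuation_patterns": 0.10,
}

def overall_confidence(scores: dict[str, float]) -> float:
    """Weighted average of the five metric scores (each 0-100)."""
    assert abs(sum(WEIGHTS.values()) - 1.0) < 1e-9  # weights must sum to 1.0
    return sum(scores[k] * w for k, w in WEIGHTS.items())

scores = {
    "grammar_perfection": 85,
    "sentence_uniformity": 72,
    "paragraph_structure": 68,
    "ai_word_frequency": 58,
    "punctuation_patterns": 45,
}
print(overall_confidence(scores))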
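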
### Confidence Levels
#### Low Confidence (0-30%): Likely Human Writing
**Characteristics:**
- Natural sentence length variation (5-40+ words)
- Occasional grammatical imperfections
- Authentic voice and natural flow
- Domain-specific terminology used naturally
- Structural variety in paragraphs
- Minimal AI-typical words (0-2 per 1000 words)
**Action:** ✅ Writing appears authentic, no changes needed
#### Medium Confidence (30-70%): Possible AI Assistance
**Characteristics:**
- Some uniformity in sentence structure
- Mix of AI-typical and natural patterns
- May be human-edited AI output
- Overly formal in places
- Some transition word overuse
- 3-5 AI-typical words per 1000 words
**Action:** ⚠️ Review flagged sections, apply suggestions selectively
**Examples of Mixed Writing:**
- AI-generated first draft with heavy human editing
- Human writing that mimics academic formality excessively
- Non-native English speakers using formal templates
- Multiple authors with different styles
#### High Confidence (70-100%): Likely AI-Generated
**Characteristics:**
- Excessive uniformity across all metrics
- Multiple AI-typical word clusters
- Perfect grammar and punctuation throughout
- Artificial transition patterns
- Mechanical paragraph structure
- 6+ AI-typical words per 1000 words
**Action:** 🚫 Significant revision needed, rewrite in authentic voice
## Output Format
When running AI-check analysis, generate a comprehensive report:
### 1. Executive Summary
```
Overall Confidence Score: 65%
Status: MEDIUM - Possible AI assistance detected
Files Analyzed: 1
Total Words: 3,456
Recommendation: Review flagged sections
```
### 2. Metric Breakdown
```
Grammar Perfection: 85% (High - suspiciously few errors)
Sentence Uniformity: 72% (High - repetitive structures)
Paragraph Structure: 68% (Medium - some variation)
AI-Typical Words: 58% (Medium - 4.2 per 1000 words)
Punctuation Patterns: 45% (Low - natural variation)
```
### 3. Flagged Sections
```
Lines 45-67 (Confidence: 82%)
Pattern: Excessive transition words + uniform sentences
AI Words: "moreover", "furthermore", "leverage", "robust"
Lines 112-134 (Confidence: 76%)
Pattern: Perfect grammar + mechanical structure
AI Words: "delve", "comprehensive", "facilitate"
```
### 4. Specific Issues Detected
```
High-Risk AI Words Found (per 1000 words):
• "delve" (2 occurrences) - RARELY used by humans
• "leverage" (3 occurrences) - Business jargon overuse
• "robust" (4 occurrences) - Technical overuse
• "furthermore" (6 occurrences) - Formal transition overuse
Sentence Uniformity Issues:
• 67% of sentences are 15-25 words (AI sweet spot)
• 82% of paragraphs start with transition words
• Low variation in sentence complexity
Paragraph Structure Issues:
• All paragraphs 4-6 sentences long
• Mechanical topic-sentence pattern throughout
```
### 5. Word Frequency Report
```
Top AI-Typical Words:
1. "furthermore" - 6x (baseline: 0.5x per 1000 words)
2. "robust" - 4x (baseline: 0.8x per 1000 words)
3. "leverage" - 3x (baseline: 0.3x per 1000 words)
4. "comprehensive" - 3x (baseline: 1.2x per 1000 words)
5. "delve" - 2x (baseline: 0.1x per 1000 words)
Comparison to Human Academic Writing:
Your text: 4.2 AI-typical words per 1000
Human baseline: 1.5 AI-typical words per 1000
Ratio: 2.8x higher than human baseline
```
## Improvement Suggestions
### For High Confidence (70-100%) Detections
**Sentence Structure:**
- ❌ "Furthermore, the results demonstrate a comprehensive analysis of the robust dataset."
- ✅ "The results show our analysis covered the full dataset."
**Why Better:** Simpler words, no transition word, more direct
**Word Choice:**
- ❌ "This study delves into the utilization of innovative methodologies."
- ✅ "We examine how researchers use new methods."
**Why Better:** Active voice, common words, clearer meaning
**Paragraph Variation:**
- ❌ All paragraphs 5 sentences, topic sentence + 3 support + conclusion
- ✅ Mix paragraph lengths: 2, 7, 4, 3, 6 sentences based on content needs
**Why Better:** Natural flow based on content, not formula
### Specific Suggestion Categories
#### 1. Vary Sentence Lengths
```
Current: 15-25 word sentences consistently
Suggestion: Mix short (5-10), medium (15-20), long (25-35) sentences
Example:
- Short: "The effect was significant."
- Medium: "We observed a 23% increase across all conditions."
- Long: "This finding aligns with previous work showing that..."
```
#### 2. Replace AI-Typical Words
```
Replace → With
- "delve into" → "examine", "explore", "investigate"
- "leverage" → "use", "apply", "employ"
- "utilize" → "use"
- "robust" → "strong", "reliable", "thorough"
- "facilitate" → "enable", "help", "allow"
- "furthermore" → "also", "next", [or remove]
- "moreover" → "also", [or use a dash]
- "comprehensive" → "complete", "thorough", "full"
```
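One way to apply the substitution table above mechanically. A naive sketch: it keeps one alternative per entry, matches case-insensitively, and always inserts the lowercase replacement:

```python
import re

# One substitution per entry, chosen from the table above.
REPLACEMENTS = {
    "delve into": "examine",
    "leverage": "use",
    "utilize": "use",
    "robust": "strong",
    "facilitate": "enable",
    "furthermore": "also",
    "moreover": "also",
    "comprehensive": "complete",
}

def suggest(text: str) -> str:
    """Replace AI-typical words using word-bounded, case-insensitive matches."""
    for old, new in REPLACEMENTS.items():
        text = re.sub(rf"\b{old}\b", new, text, flags=re.IGNORECASE)
    return text

print(suggest("We leverage a robust pipeline to facilitate analysis."))
```

A production version would preserve sentence-initial capitalization and offer all alternatives rather than rewriting in place, but the table-driven approach is the core idea.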
#### 3. Add Natural Imperfections (Where Appropriate)
```
- Use contractions in appropriate contexts ("it's", "we'll")
- Include domain-specific jargon naturally
- Allow informal phrasing in methods/procedures
- Use occasional sentence fragments for emphasis
- Add personal observations or interpretations
- Include field-specific colloquialisms
```
#### 4. Break Paragraph Uniformity
```
Current: All paragraphs follow topic-support-support-support-conclusion
Suggestion: Vary based on content
- Use single-sentence paragraphs for emphasis
- Combine related ideas into longer paragraphs
- Don't force every paragraph to have 5 sentences
- Let content determine structure, not formula
```
#### 5. Remove Mechanical Transitions
```
❌ "Furthermore, the results show... Moreover, the analysis reveals..."
✅ "The results show... The analysis also reveals..." [simpler transitions]
✅ "The results show... Looking closer, the analysis..." [natural bridges]
```
## Integration Points
### 1. Pre-Commit Hook Integration
**Automatic checking before git commits**
```bash
# Configured in .claude/settings.json
"gitPreCommit": {
  "command": "python3 hooks/pre-commit-ai-check.py",
  "enabled": true
}
```
**Behavior:**
- Runs on staged `.md`, `.tex`, `.rst` files
- Warns if confidence 30-70%
- Blocks commit if confidence >70%
- User can override with `git commit --no-verify`
**Exit Codes:**
- 0: Pass (confidence <30%)
- 1: Warning (confidence 30-70%, commit allowed)
- 2: Block (confidence >70%, commit blocked)
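The exit-code contract above maps cleanly to a small helper. The confidence computation itself is stubbed out here, and `precommit_status` is a hypothetical name, not the hook's actual API:

```python
def precommit_status(confidence: float) -> int:
    """Map a per-file AI-confidence score in [0, 1] to the hook's exit codes."""
    if confidence >= 0.70:
        return 2  # block commit
    if confidence >= 0.30:
        return 1  # warning, commit still allowed
    return 0      # pass

# The real hook would score each staged .md/.tex/.rst file and exit with
# the worst status; the scores here are made up for illustration.
print([precommit_status(c) for c in (0.12, 0.45, 0.91)])  # → [0, 1, 2]
```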
### 2. Quality Assurance Integration
**Part of comprehensive QA workflow**
Integrated into `code/quality_assurance/qa_manager.py`:
- Runs during manuscript phase QA validation
- Checks all deliverable documents
- Generates detailed QA report section
- Fails QA if confidence >40%
**Configuration** (`.ai-check-config.yaml`):
```yaml
qa_integration:
  enabled: true
  max_confidence_threshold: 0.40
  check_manuscripts: true
  check_documentation: true
  generate_detailed_reports: true
```
### 3. Manuscript Writer Agent Integration
**Real-time feedback during writing**
Agent checks writing incrementally:
- After drafting each section
- Before moving to next phase
- Applies suggestions automatically
- Re-checks until confidence <30%
**Agent Workflow:**
1. Draft section
2. Run ai-check skill
3. Review detection results
4. Apply improvement suggestions
5. Re-check until authentic
6. Proceed to next section
### 4. Standalone Skill Usage
**Manual invocation by user or agents**
**User Invocation:**
```
Please run ai-check on docs/manuscript/discussion.tex and provide detailed feedback.
```
**Agent Invocation:**
```
I'll use the ai-check skill to verify this text before proceeding.
```
**CLI Tool:**
```bash
python tools/ai_check.py path/to/file.md
python tools/ai_check.py --directory docs/
python tools/ai_check.py --format html --output report.html
```
## Tracking System
### Historical Tracking
Log all AI-check runs to database for evolution tracking:
**Database Schema** (PostgreSQL via research-database MCP):
```sql
CREATE TABLE ai_check_history (
  id SERIAL PRIMARY KEY,
  file_path TEXT NOT NULL,
  git_commit TEXT,
  timestamp TIMESTAMP DEFAULT NOW(),
  overall_confidence FLOAT,
  grammar_score FLOAT,
  sentence_score FLOAT,
  paragraph_score FLOAT,
  word_score FLOAT,
  punctuation_score FLOAT,
  ai_words_found JSONB,
  flagged_sections JSONB
);
```
### Trend Analysis
Track writing evolution:
```
File: docs/manuscript/discussion.tex
Version History:
2025-01-15: 78% confidence (HIGH - likely AI)
2025-01-18: 52% confidence (MEDIUM - revision 1)
2025-01-20: 34% confidence (LOW-MEDIUM - revision 2)
2025-01-22: 18% confidence (LOW - authentic writing)
Trend: ✅ Improving toward authentic writing
```
**Use Cases:**
- Monitor dissertation chapters over time
- Track improvements after applying suggestions
- Demonstrate writing authenticity to committee
- Identify sections needing more work
## Configuration
### Configuration File: `.ai-check-config.yaml`
```yaml
# AI-Check Skill Configuration

# Pre-Commit Hook Settings
pre_commit:
  enabled: true
  check_files: [".md", ".tex", ".rst", ".txt"]
  check_docstrings: true   # Check Python docstrings
  block_threshold: 0.70    # Block commit if >= 70%
  warn_threshold: 0.30     # Warn if >= 30%
  exclude_patterns:
    - "*/examples/*"
    - "*/tests/*"
    - "*/node_modules/*"
    - "*/.venv/*"

# Quality Assurance Integration
qa_integration:
  enabled: true
  max_confidence_threshold: 0.40  # Fail QA if >= 40%
  check_manuscripts: true
  check_documentation: true
  generate_detailed_reports: true
  track_history: true

# Detection Parameters
detection:
  # Weight each metric (must sum to 1.0)
  weights:
    grammar_perfection: 0.20
    sentence_uniformity: 0.25
    paragraph_structure: 0.20
    ai_word_frequency: 0.25
    punctuation_patterns: 0.10
  # AI-typical word lists
  ai_words:
    high_risk: ["delve", "leverage", "utilize"]
    medium_risk: ["robust", "comprehensive", "facilitate"]
    transitions: ["furthermore", "moreover", "additionally"]
  # Thresholds
  ai_words_per_1000_threshold: 3.0
  human_baseline_per_1000: 1.5

# Report Generation
reporting:
  default_format: "markdown"  # markdown, json, html
  include_suggestions: true
  include_word_frequency: true
  include_flagged_sections: true
  max_flagged_sections: 10

# Tracking
tracking:
  enabled: true
  database: "research-database-mcp"
  retention_days: 365
```
### Per-Project Overrides
Create `.ai-check.local.yaml` for project-specific settings:
```yaml
# Project-specific overrides
pre_commit:
  block_threshold: 0.60  # More lenient for early drafts
detection:
  ai_words:
    high_risk: ["delve"]  # Only flag worst offenders
```
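The override mechanism amounts to recursively merging the local file over the base config. A sketch on already-parsed dictionaries (YAML parsing omitted; `deep_merge` is an illustrative helper, not part of the skill's API):

```python
def deep_merge(base: dict, override: dict) -> dict:
    """Overlay override values onto base, recursing into nested dicts."""
    merged = dict(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = value
    return merged

# Parsed .ai-check-config.yaml (excerpt) and .ai-check.local.yaml:
base = {"pre_commit": {"block_threshold": 0.70, "warn_threshold": 0.30},
        "tracking": {"enabled": True}}
local = {"pre_commit": {"block_threshold": 0.60}}

merged = deep_merge(base, local)
print(merged["pre_commit"])  # block_threshold overridden, warn_threshold kept
```

Keys absent from the local file fall through to the base config, so overrides stay minimal.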
## Examples
### Example 1: High Confidence Detection
**Input Text:**
```
Furthermore, this comprehensive study delves into the robust
methodologies utilized to facilitate the implementation of innovative
approaches. Moreover, the analysis demonstrates significant findings
that leverage state-of-the-art techniques. Subsequently, the results
indicate substantial improvements across all metrics. Nevertheless,
additional research is crucial to fully comprehend the implications.
```
**AI-Check Report:**
```
Overall Confidence: 89% (HIGH - Likely AI-generated)
Issues Detected:
- 15 AI-typical words in 46 words (over 300 per 1000 words!)
- Every sentence starts with transition word
- Little sentence-length variation (all four sentences 9-17 words)
- Perfect grammar, zero natural imperfections
- Mechanical paragraph structure
AI Words Found:
- furthermore, comprehensive, delves, robust
- utilized, facilitate, innovative, leverage
- demonstrates, significant, subsequently, substantial
- nevertheless, crucial, comprehend
Recommendation: Complete rewrite recommended
```
**Suggested Revision:**
```
We examined the methods used in this approach. The analysis shows
clear improvements across metrics. However, more research is needed
to understand the full implications.
(25 words, 12% confidence - much more natural)
```
### Example 2: Medium Confidence Detection
**Input Text:**
```
The experimental design followed standard protocols established in
previous work (Smith et al., 2023). We collected data from 150
participants over six months. Statistical analysis used mixed-effects
models to account for repeated measures. The results showed a
significant main effect of condition (p < 0.001).
```
**AI-Check Report:**
```
Overall Confidence: 35% (MEDIUM - Possible minor AI assistance)
Issues Detected:
- Slightly uniform sentence length (11-15 words)
- One AI-typical word: "significant" (statistical context acceptable)
- Otherwise natural academic writing
Recommendation: Minor revisions optional, writing appears largely authentic
```
### Example 3: Low Confidence (Human Writing)
**Input Text:**
```
OK so here's what we found. The effect was huge - way bigger than
expected. Participants in the experimental group scored 23% higher
on average. This wasn't just statistically significant; it was
practically meaningful.
We're still not sure why. Maybe it's the timing? Could be the
instructions were clearer. Need to run follow-ups.
```
**AI-Check Report:**
```
Overall Confidence: 8% (LOW - Clearly human writing)
Human Writing Indicators:
- Natural sentence variation (4-19 words)
- Informal elements ("OK so", "way bigger")
- Incomplete thoughts and questions
- Natural uncertainty expressions
- Zero AI-typical words
- Authentic voice throughout
Recommendation: Writing is authentic, no changes needed
```
## Best Practices
### For PhD Students
1. **Run Before Advisor Meetings**
- Check chapters before sending to advisor
- Ensure authenticity before committee review
- Track improvements over time
2. **Use During Drafting**
- Check each section after writing
- Apply suggestions immediately
- Develop natural writing habits
3. **Pre-Submission Validation**
- Run on complete manuscripts before journal submission
- Check supplementary materials
- Verify all documentation
### For Research Teams
1. **Establish Team Standards**
- Set agreed-upon confidence thresholds
- Define when to block vs. warn
- Create team-specific word lists
2. **Code Review Integration**
- Check documentation in pull requests
- Validate README files and guides
- Ensure authentic technical writing
3. **Track Team Writing**
- Monitor trends across team members
- Identify systematic issues
- Share improvement strategies
### For Journal Submission
1. **Pre-Submission Checklist**
- [ ] Overall confidence <30%
- [ ] No flagged high-risk sections
- [ ] AI-typical words <2 per 1000 words
- [ ] Natural sentence variation present
- [ ] Authentic academic voice throughout
2. **Demonstrating Authenticity**
- Include AI-check reports in submission materials
- Show writing evolution over time
- Document revision process
## Limitations
### What This Skill Cannot Do
1. **Not 100% Accurate**
- LLMs constantly improving
- Patterns evolve over time
- False positives possible (very formal human writing)
- False negatives possible (heavily edited AI text)
2. **Cannot Detect All AI Usage**
- Well-edited AI text may pass
- Human writing in AI style may be flagged
- Paraphrasing tools may evade detection
- Future models may have different patterns
3. **Domain Limitations**
- Trained primarily on academic writing
- May not work well for creative writing
- Technical jargon may affect scores
- Non-English text not supported
### Use Alongside Human Judgment
**This skill is a tool, not a replacement for human judgment:**
- Use confidence scores as guidance, not absolute truth
- Consider context and field-specific norms
- Combine with plagiarism detection tools
- Maintain academic integrity standards
- Update word lists as AI patterns evolve
## Support
### Troubleshooting
**Problem:** False positive on authentic writing
**Solution:** Check if writing is overly formal. Consider field-specific norms. Adjust thresholds in config.
**Problem:** AI text passing with low confidence
**Solution:** Update AI-typical word lists. Check for heavily edited text. Report patterns for skill updates.
**Problem:** Pre-commit hook too slow
**Solution:** Reduce checked file types. Enable caching. Check only modified sections.
**Problem:** Disagreement with manual review
**Solution:** Generate detailed report. Review flagged sections specifically. Consider multiple metrics not just overall score.
### Getting Help
- **Documentation:** See `docs/skills/ai-check-reference.md`
- **Issues:** https://github.com/astoreyai/ai_scientist/issues
- **Updates:** Check for skill updates regularly as AI patterns evolve
---
**Last Updated:** 2025-11-09
**Version:** 1.0.0
**License:** MIT

skills/blinding/SKILL.md
---
name: blinding
description: "Implement blinding procedures to reduce bias in experimental studies. Use when: (1) Designing RCTs, (2) Ensuring objectivity, (3) Meeting CONSORT standards, (4) Minimizing performance and detection bias."
allowed-tools: Read, Write
version: 1.0.0
---
# Blinding Procedures Skill
## Purpose
Implement appropriate blinding to reduce bias in research studies.
## Types of Blinding
**Single-Blind**: Participants unaware of allocation
**Double-Blind**: Participants and researchers unaware
**Triple-Blind**: Participants, researchers, and analysts unaware
## Who to Blind
**Participants**: Reduce expectancy and placebo effects
**Interventionists**: Reduce performance bias
**Assessors**: Reduce detection bias
**Analysts**: Reduce reporting bias
## Blinding Strategies
**Medications:**
- Identical placebo (same appearance, taste)
- Over-encapsulation
- Matching packaging
**Behavioral Interventions:**
- Attention-matched control
- Active control condition
- Blind outcome assessors
**Assessments:**
- Automated/computerized measures
- Independent blinded raters
- Objective outcomes (less bias-prone)
## When Blinding Is Impossible
- Surgical interventions
- Exercise interventions
- Educational programs
**Mitigation:**
- Blind outcome assessors
- Use objective outcomes
- Report lack of blinding as limitation
---
**Version:** 1.0.0

---
name: citation-format
description: "Format citations and bibliographies in multiple academic styles (APA, IEEE, Chicago, Harvard, MLA, Nature, Science). Use when: (1) Converting between citation styles for different journals, (2) Cleaning and standardizing bibliography entries, (3) Validating citation formatting before submission, (4) Generating properly formatted reference lists, (5) Checking citation consistency across manuscripts."
allowed-tools: Read, Write
version: 1.0.0
---
# Citation Formatting Skill
## Purpose
Format academic citations according to standard style guides. Ensures proper citation formatting for journal submissions, dissertations, and academic publications.
## Supported Citation Styles
### 1. APA 7th Edition
**Use for:** Psychology, education, social sciences
**Journal Article:**
```
Smith, J., & Jones, M. (2023). Title of article. Journal Name, 15(2), 123-145. https://doi.org/10.1000/xyz123
```
**Book:**
```
Author, A. A. (2023). Title of book (2nd ed.). Publisher Name.
```
**Chapter:**
```
Author, A. A. (2023). Chapter title. In B. B. Editor (Ed.), Book title (pp. 45-67). Publisher.
```
### 2. IEEE
**Use for:** Engineering, computer science, technology
**Journal Article:**
```
[1] J. Smith and M. Jones, "Title of article," Journal Name, vol. 15, no. 2, pp. 123-145, 2023.
```
**Conference:**
```
[1] J. Smith, "Paper title," in Proc. Conference Name, City, Country, 2023, pp. 45-52.
```
### 3. Chicago (Author-Date)
**Use for:** History, arts, humanities
**Journal Article:**
```
Smith, John, and Mary Jones. 2023. "Title of Article." Journal Name 15 (2): 123-145.
```
**Book:**
```
Smith, John. 2023. Book Title. City: Publisher Name.
```
### 4. Chicago (Notes-Bibliography)
**Footnote:**
```
1. John Smith and Mary Jones, "Title of Article," Journal Name 15, no. 2 (2023): 123-145.
```
**Bibliography:**
```
Smith, John, and Mary Jones. "Title of Article." Journal Name 15, no. 2 (2023): 123-145.
```
### 5. Harvard
**Use for:** UK universities, various disciplines
```
Smith, J. and Jones, M. (2023) 'Title of article', Journal Name, 15(2), pp. 123-145.
```
### 6. MLA 9th Edition
**Use for:** Literature, languages, humanities
```
Smith, John, and Mary Jones. "Title of Article." Journal Name, vol. 15, no. 2, 2023, pp. 123-145.
```
### 7. Nature
**Use for:** Natural sciences
```
Smith, J. & Jones, M. Title of article. Journal Name 15, 123-145 (2023).
```
### 8. Science
**Use for:** Multidisciplinary sciences
```
J. Smith, M. Jones, Title of article. Journal Name 15, 123 (2023).
```
## When to Use This Skill
1. **Journal Submission** - Format references for target journal
2. **Style Conversion** - Convert between styles (APA → IEEE)
3. **Dissertation Formatting** - Ensure consistency across chapters
4. **Bibliography Cleaning** - Fix formatting errors in .bib files
5. **Citation Validation** - Verify proper formatting before submission
6. **Collaborative Writing** - Standardize citations from multiple authors
## Common Formatting Tasks
### Task 1: Convert Citation Style
**Input:**
```
Style: APA → IEEE
Citation: Smith, J., & Jones, M. (2023). Deep learning. AI Journal, 15(2), 123-145.
```
**Process:**
1. Extract metadata: authors, title, journal, volume, issue, pages, year
2. Apply IEEE template
3. Add numbering
**Output:**
```
[1] J. Smith and M. Jones, "Deep learning," AI Journal, vol. 15, no. 2, pp. 123-145, 2023.
```
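The extract-metadata, apply-template, add-numbering process above can be sketched as a small formatter. This is a two-author simplification: real IEEE rules for three or more authors, month fields, and journal abbreviations are not handled:

```python
def to_ieee(meta: dict, number: int = 1) -> str:
    """Render parsed citation metadata as an IEEE journal reference."""
    # "Smith, J." -> "J. Smith"; joined with "and" (two-author case only).
    authors = " and ".join(
        f"{first} {last}"
        for last, first in (a.split(", ", 1) for a in meta["authors"])
    )
    return (f'[{number}] {authors}, "{meta["title"]}," {meta["journal"]}, '
            f'vol. {meta["volume"]}, no. {meta["issue"]}, '
            f'pp. {meta["pages"]}, {meta["year"]}.')

meta = {
    "authors": ["Smith, J.", "Jones, M."],
    "title": "Deep learning",
    "journal": "AI Journal",
    "volume": 15, "issue": 2, "pages": "123-145", "year": 2023,
}
print(to_ieee(meta))
```

The same parsed `meta` dictionary can feed a template per target style, which is what makes style conversion mostly mechanical once extraction is done.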
### Task 2: Clean BibTeX Entry
**Input (messy):**
```bibtex
@article{smith2023,
author={Smith, John and Jones, Mary and Johnson, Bob},
title={A Really Long Title That Goes On And On},
journal={Journal},
year=2023,
volume=15,
pages={123--145},
doi={10.1000/xyz}
}
```
**Output (cleaned):**
```bibtex
@article{smith2023,
  author  = {Smith, John and Jones, Mary and Johnson, Bob},
  title   = {A Really Long Title That Goes On and On},
  journal = {Journal},
  year    = {2023},
  volume  = {15},
  pages   = {123--145},
  doi     = {10.1000/xyz}
}
```
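Part of this cleanup (bracing bare values and normalizing `field = {value},` spacing) can be done with a line-based pass. A narrow sketch that assumes one field per line and no nested braces in values:

```python
import re

def tidy_bibtex(entry: str) -> str:
    """Brace bare field values and normalize spacing, one field per line.

    Assumes a simple entry: no nested braces, no multi-line values.
    """
    lines = entry.strip().splitlines()
    out = [lines[0]]  # "@article{key," line passes through
    for line in lines[1:-1]:
        m = re.match(r"\s*(\w+)\s*=\s*\{?([^{}]+?)\}?,?\s*$", line)
        if m:
            field, value = m.groups()
            out.append(f"  {field} = {{{value}}},")
        else:
            out.append(line)
    out.append(lines[-1])  # closing "}"
    return "\n".join(out)

messy = "@article{smith2023,\nauthor={Smith, John},\nyear=2023,\npages={123--145}\n}"
print(tidy_bibtex(messy))
```

Title-casing fixes and metadata completion need bibliographic knowledge (or a DOI lookup) and are out of scope for a purely syntactic pass like this.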
### Task 3: Generate Reference List
**Input:** List of DOIs or BibTeX entries
**Output:** Formatted reference list in specified style
## Formatting Rules by Style
### APA 7th Edition Rules
- **Author names:** Last, F. M., & Last, F. M.
- **Year:** (2023)
- **Title:** Sentence case (only first word capitalized)
- **Journal:** Title Case, volume(issue), pages
- **DOI:** https://doi.org/10.xxxx/yyyy
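The sentence-case rule above can be sketched as a small helper. Simplified: real APA style also preserves proper nouns and acronyms, which plain string logic cannot detect:

```python
def sentence_case(title: str) -> str:
    """APA-style titles: capitalize only the first word and the first
    word after a colon; everything else is lowercased."""
    parts = []
    for segment in title.split(": "):
        words = segment.lower().split()
        if words:
            words[0] = words[0].capitalize()
        parts.append(" ".join(words))
    return ": ".join(parts)

print(sentence_case("Deep Learning Applications: A Survey Of Methods"))
```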
### IEEE Rules
- **Numbering:** [1], [2], [3] in order of appearance
- **Author names:** F. M. Last and F. M. Last
- **Title:** "Title case with quotes"
- **Journal:** Italic Journal, vol. 15, no. 2, pp. 123-145, Month 2023
- **DOI:** doi: 10.xxxx/yyyy
### Chicago Rules
- **Author-Date:** Last, First, and First Last. Year.
- **Notes:** Superscript footnote numbers
- **Bibliography:** Alphabetical by last name
- **Title:** Title Case for Books, "Sentence case" for articles
## Integration with Other Components
### With Citation-Management MCP Server
```
Use citation-management MCP to:
1. Fetch metadata from DOI
2. Verify citations via Crossref
3. Check for retractions
4. Then apply formatting with this skill
```
### With Manuscript-Writer Agent
```
Agent workflow:
1. Collect citations during writing
2. Use citation-format skill to format
3. Generate bibliography
4. Validate all citations before submission
```
### With Bibliography Tools
- **BibTeX:** Parse and reformat .bib files
- **Zotero:** Export and format Zotero libraries
- **Mendeley:** Convert Mendeley exports
## Examples
### Example 1: Multi-Author APA Citation
**Input Data:**
```
Authors: Smith, J., Jones, M., Johnson, B., Williams, K., Brown, L., Davis, R., Miller, T., Wilson, P.
Year: 2023
Title: large-scale meta-analysis of intervention effects
Journal: Psychological Bulletin
Volume: 149
Issue: 3
Pages: 456-489
DOI: 10.1037/bul0000123
```
**APA Format (all authors listed; APA 7 lists up to 20 authors before eliding):**
```
Smith, J., Jones, M., Johnson, B., Williams, K., Brown, L., Davis, R., Miller, T., & Wilson, P. (2023). Large-scale meta-analysis of intervention effects. Psychological Bulletin, 149(3), 456-489. https://doi.org/10.1037/bul0000123
```
### Example 2: Conference Proceedings (IEEE)
**Input Data:**
```
Authors: Zhang, L., Kumar, R.
Year: 2024
Title: Neural architecture search using reinforcement learning
Conference: International Conference on Machine Learning
Location: Vienna, Austria
Pages: 1234-1242
```
**IEEE Format:**
```
[1] L. Zhang and R. Kumar, "Neural architecture search using reinforcement learning," in Proc. Int. Conf. Mach. Learn., Vienna, Austria, 2024, pp. 1234-1242.
```
### Example 3: Book Chapter (Chicago)
**Input Data:**
```
Author: Thompson, S.
Year: 2023
Chapter: Qualitative research methods in education
Editors: Anderson, P., Baker, M.
Book: Handbook of Educational Research
Pages: 234-267
Publisher: Academic Press
Location: New York
```
**Chicago (Author-Date):**
```
Thompson, Sarah. 2023. "Qualitative Research Methods in Education." In Handbook of Educational Research, edited by Peter Anderson and Michelle Baker, 234-267. New York: Academic Press.
```
## Validation Checklist
Before finalizing citations, verify:
- [ ] **Author names** formatted correctly for style
- [ ] **Publication year** present and correct
- [ ] **Title** capitalization follows style rules
- [ ] **Journal names** spelled out or abbreviated per style
- [ ] **Volume and issue** numbers formatted correctly
- [ ] **Page ranges** use the correct separator (en dash –, double hyphen --, or hyphen -)
- [ ] **DOIs** formatted correctly and functional
- [ ] **Punctuation** matches style guide exactly
- [ ] **Hanging indents** applied (APA, MLA, Chicago)
- [ ] **Numbering** sequential and correct (IEEE)
- [ ] **Alphabetical order** correct (most styles)
## Common Errors to Fix
### Error 1: Incorrect Capitalization
❌ "Deep Learning Applications in Medical Imaging" (Title Case in APA)
✅ "Deep learning applications in medical imaging" (Sentence case)
### Error 2: Missing DOI
❌ Citation without DOI
✅ Add: https://doi.org/10.xxxx/yyyy
### Error 3: Inconsistent Formatting
❌ Mixed citation styles in one document
✅ All citations use same style
### Error 4: Incorrect Ampersand Usage
❌ "Smith and Jones" in APA in-text citation
✅ "Smith & Jones" in APA in-text citation
### Error 5: Page Range Separator
❌ "123-145" in Chicago (should be en-dash)
✅ "123–145" in Chicago
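Several of these mechanical errors can be caught automatically. The sketch below is a minimal, illustrative linter, not a full style checker; the function name and messages are our own, not part of any citation standard.

```python
import re

def lint_reference(ref: str) -> list[str]:
    """Flag a few mechanical citation errors (illustrative checks only)."""
    problems = []
    if re.search(r"\b\d+-\d+\b", ref):  # hyphen in a page range (Errors 1/5 family)
        problems.append("use an en-dash in page ranges (123\u2013145, not 123-145)")
    if "doi.org" not in ref:            # Error 2: missing DOI
        problems.append("no DOI found")
    if re.search(r"\([A-Z][a-z]+ and [A-Z][a-z]+, \d{4}\)", ref):  # Error 4
        problems.append("APA parenthetical citations use '&', not 'and'")
    return problems

# Flags the hyphenated page range and the missing DOI:
print(lint_reference('Smith, J. (2023). Title. Journal, 12(3), 123-145.'))
```

A reference manager remains the primary tool; a check like this is only a final safety net.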
## Best Practices
1. **Choose Style Early** - Select citation style at project start
2. **Use Reference Manager** - Zotero, Mendeley, or EndNote
3. **Validate DOIs** - Ensure all DOIs resolve correctly
4. **Check Journal Requirements** - Some journals have specific rules
5. **Automate When Possible** - Use tools for consistent formatting
6. **Proofread Carefully** - Citation errors are a common reason for rejection
## Resources
- **APA Style:** https://apastyle.apa.org/
- **IEEE Reference Guide:** https://ieee-dataport.org/sites/default/files/analysis/27/IEEE%20Citation%20Guidelines.pdf
- **Chicago Manual of Style:** https://www.chicagomanualofstyle.org/
- **MLA Handbook:** https://style.mla.org/
- **Purdue OWL:** https://owl.purdue.edu/owl/research_and_citation/
---
**Last Updated:** 2025-11-09
**Version:** 1.0.0

---
name: data-visualization
description: "Create publication-quality data visualizations. Use when: (1) Presenting results, (2) Exploratory data analysis, (3) Manuscript preparation, (4) Grant proposals, (5) Presentations."
allowed-tools: Read, Write, Bash
version: 1.0.0
---
# Data Visualization Skill
## Purpose
Create clear, publication-ready figures following best practices.
## Figure Types by Data
**Continuous Outcome:**
- Bar plots (means with error bars)
- Box plots (distributions)
- Violin plots (density)
**Categorical Data:**
- Bar charts (counts/proportions)
- Stacked bars (composition)
**Relationships:**
- Scatter plots (correlations)
- Line graphs (time series)
**Distributions:**
- Histograms
- Density plots
- Q-Q plots (normality)
## Best Practices
**Design:**
- Clear axis labels with units
- Legible font sizes (≥10pt)
- Color-blind friendly palettes
- Minimal chart junk
**Statistics:**
- Show individual data points when N < 50
- Error bars: 95% CI (not SE or SD)
- Asterisks for significance with p-value in caption
**Format:**
- High resolution (≥300 DPI)
- Vector format (PDF, SVG) preferred
- Grayscale-compatible
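As a concrete starting point, the following matplotlib sketch applies these defaults (individual points for N < 50, 95% CI error bars, a labeled axis with units, vector output at 300 DPI). The data are simulated and the styling choices are illustrative, not prescriptive.

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # non-interactive backend for scripted figure generation
import matplotlib.pyplot as plt

rng = np.random.default_rng(42)
groups = {"Control": rng.normal(50, 10, 30), "Treatment": rng.normal(58, 10, 30)}

fig, ax = plt.subplots(figsize=(4, 3))
for i, (name, y) in enumerate(groups.items()):
    # N < 50, so show the individual data points (jittered, grayscale-friendly)
    ax.scatter(np.full(y.size, float(i)) + rng.uniform(-0.08, 0.08, y.size),
               y, alpha=0.5, color="0.4")
    ci = 1.96 * y.std(ddof=1) / np.sqrt(y.size)  # 95% CI, not SE or SD
    ax.errorbar(i, y.mean(), yerr=ci, fmt="o", color="black", capsize=4)
ax.set_xticks(range(len(groups)), labels=list(groups))
ax.set_ylabel("Outcome score (points)")  # clear axis label with units
fig.savefig("figure1.pdf", dpi=300, bbox_inches="tight")  # vector, >=300 DPI
```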
---
**Version:** 1.0.0

---
name: effect-size
description: "Calculate and interpret effect sizes for statistical analyses. Use when: (1) Reporting research results to show practical significance, (2) Meta-analysis to combine study results, (3) Grant writing to justify expected effects, (4) Interpreting published studies beyond p-values, (5) Sample size planning for power analysis."
allowed-tools: Read, Write
version: 1.0.0
---
# Effect Size Calculation Skill
## Purpose
Calculate standardized effect sizes to quantify the magnitude of research findings. Essential for reporting practical significance beyond p-values.
## Common Effect Size Measures
### Cohen's d (Mean Differences)
**Use:** T-tests, group comparisons on continuous outcomes
```
d = (M₁ - M₂) / SD_pooled
Interpretation:
- Small: d = 0.2
- Medium: d = 0.5
- Large: d = 0.8
```
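A direct translation of this formula into code (a sketch assuming two independent samples; the function name is ours):

```python
import math

def cohens_d(x, y):
    """Cohen's d: mean difference divided by the pooled standard deviation."""
    nx, ny = len(x), len(y)
    mx, my = sum(x) / nx, sum(y) / ny
    vx = sum((v - mx) ** 2 for v in x) / (nx - 1)  # sample variances
    vy = sum((v - my) ** 2 for v in y) / (ny - 1)
    sd_pooled = math.sqrt(((nx - 1) * vx + (ny - 1) * vy) / (nx + ny - 2))
    return (mx - my) / sd_pooled
```

For example, `cohens_d([5, 6, 7, 8], [3, 4, 5, 6])` evaluates to about 1.55, a large effect by the thresholds above.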
### Pearson's r (Correlations)
**Interpretation:**
- Small: r = 0.10
- Medium: r = 0.30
- Large: r = 0.50
### Eta-squared (η²) and Partial Eta-squared (η²ₚ)
**Use:** ANOVA, variance explained
```
η² = SS_effect / SS_total
η²ₚ = SS_effect / (SS_effect + SS_error)
Interpretation:
- Small: η² = 0.01
- Medium: η² = 0.06
- Large: η² = 0.14
```
### Odds Ratio (OR) and Risk Ratio (RR)
**Use:** Binary outcomes, clinical trials
```
OR = (a/b) / (c/d) [from 2x2 table]
Interpretation:
- OR = 1: No effect
- OR > 1: Increased odds
- OR < 1: Decreased odds
```
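The 2x2 computation, with a standard log-scale 95% CI added (Woolf method); cell labels follow the a/b/c/d layout above:

```python
import math

def odds_ratio(a, b, c, d, z=1.96):
    """OR = ad/bc from a 2x2 table, with a log-scale (Woolf) 95% CI."""
    or_ = (a * d) / (b * c)
    se_log = math.sqrt(1/a + 1/b + 1/c + 1/d)  # SE of ln(OR)
    lo = math.exp(math.log(or_) - z * se_log)
    hi = math.exp(math.log(or_) + z * se_log)
    return or_, (lo, hi)
```

For the table a = 20, b = 10, c = 10, d = 20 this gives OR = 4.0 with a 95% CI of roughly [1.37, 11.70]; a CI excluding 1 indicates a significant association.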
## Always Report with Confidence Intervals
```
Example: d = 0.52, 95% CI [0.28, 0.76]
This shows:
- Best estimate: d = 0.52 (medium effect)
- Precision: CI width suggests adequate sample size
- Excludes zero: Effect is statistically significant
```
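A standard large-sample approximation for the CI around Cohen's d is sketched below; treat it as an approximation, since exact intervals use the noncentral t distribution.

```python
import math

def d_confidence_interval(d, n1, n2, z=1.96):
    """Approximate 95% CI for Cohen's d via the large-sample standard error."""
    se = math.sqrt((n1 + n2) / (n1 * n2) + d ** 2 / (2 * (n1 + n2)))
    return d - z * se, d + z * se
```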
## Integration
Use with power-analysis skill for study planning and with statistical analysis for results reporting.
---
**Version:** 1.0.0

---
name: experiment-design
description: "Design rigorous experiments following best practices. Use when: (1) Planning research studies, (2) Grant proposal development, (3) Pre-registration, (4) Ensuring internal validity, (5) Meeting NIH rigor standards."
allowed-tools: Read, Write
version: 1.0.0
---
# Experiment Design Skill
## Purpose
Design methodologically rigorous experiments with appropriate controls and randomization.
## Key Design Elements
**1. Research Question**: Clear, testable hypothesis
**2. Study Design**: RCT, quasi-experimental, observational
**3. Sample Size**: Power analysis justified
**4. Randomization**: Method specified
**5. Blinding**: Who is blinded
**6. Controls**: Appropriate comparison groups
**7. Outcomes**: Primary and secondary clearly defined
**8. Analysis Plan**: Pre-specified statistical approach
## Common Designs
**Between-Subjects**: Different participants per condition
**Within-Subjects**: Same participants, repeated measures
**Factorial**: Multiple factors (2x2, 2x3)
**Crossover**: Participants receive all treatments
**Stepped-Wedge**: Phased rollout
## NIH Rigor Checklist
- [ ] Scientific premise established
- [ ] Rigorous design (appropriate controls)
- [ ] Biological variables considered (SABV)
- [ ] Authentication of key resources
- [ ] Transparent reporting planned
---
**Version:** 1.0.0

---
name: hypothesis-test
description: "Guide selection and interpretation of statistical hypothesis tests. Use when: (1) Choosing appropriate test for research data, (2) Checking assumptions before analysis, (3) Interpreting test results correctly, (4) Reporting statistical findings, (5) Troubleshooting assumption violations."
allowed-tools: Read, Write
version: 1.0.0
---
# Hypothesis Testing Skill
## Purpose
Guide appropriate selection and interpretation of statistical hypothesis tests for research data analysis.
## Test Selection Decision Tree
### Step 1: How many variables?
**One variable:**
- Categorical → Chi-square goodness of fit
- Continuous → One-sample t-test
**Two variables:**
- Both categorical → Chi-square test of independence
- One categorical, one continuous → T-test or ANOVA
- Both continuous → Correlation or regression
**Three+ variables:**
- Multiple predictors → Multiple regression or ANOVA
- Complex designs → Mixed models or advanced methods
### Step 2: Check assumptions
**For t-tests:**
1. Independence of observations
2. Normality (especially for small N)
3. Homogeneity of variance
**Violations?**
- Non-normal → Mann-Whitney U (non-parametric)
- Unequal variance → Welch's t-test
- Dependent observations → Paired t-test or mixed models
**For ANOVA:**
1. Independence
2. Normality
3. Homogeneity of variance
4. No outliers
**Violations?**
- Non-normal → Kruskal-Wallis test
- Unequal variance → Welch's ANOVA
- Outliers → Robust methods or transformation
### Step 3: Interpret results
Always report:
1. **Test statistic** (t, F, χ²)
2. **Degrees of freedom**
3. **p-value**
4. **Effect size with CI**
5. **Descriptive statistics**
**Example:**
```
Independent samples t-test showed a significant difference between
groups, t(98) = 3.45, p < .001, d = 0.69, 95% CI [0.29, 1.09].
The experimental group (M = 45.2, SD = 8.3) scored higher than
control (M = 37.8, SD = 9.1).
```
## Common Tests Reference
| Research Question | Test | Assumptions |
|------------------|------|-------------|
| 2 groups, continuous outcome | Independent t-test | Normality, equal variance |
| 2 measurements, same people | Paired t-test | Normality of differences |
| 3+ groups, one factor | One-way ANOVA | Normality, homogeneity |
| 3+ groups, multiple factors | Factorial ANOVA | Normality, homogeneity |
| Relationship between variables | Pearson correlation | Linearity, normality |
| Predict continuous outcome | Linear regression | Linearity, normality of residuals |
| 2 categorical variables | Chi-square test | Expected frequencies ≥5 |
| Ordinal data, 2 groups | Mann-Whitney U | None (non-parametric) |
| Ordinal data, paired | Wilcoxon signed-rank | None (non-parametric) |
## Assumption Checking
### Normality
```
Visual: Q-Q plot, histogram
Statistical: Shapiro-Wilk test (N < 50), Kolmogorov-Smirnov (N ≥ 50)
Guideline: Robust to moderate violations if N ≥ 30
```
### Homogeneity of Variance
```
Visual: Box plots, residual plots
Statistical: Levene's test, Bartlett's test
Guideline: Ratio of largest/smallest variance < 4
```
### Independence
```
Check: Research design, data collection
Red flags: Time series, clustered data, repeated measures
Solution: Use appropriate model (mixed effects, GEE)
```
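These checks can be scripted with scipy, illustrated here on simulated data; the specific functions are one reasonable choice, not the only valid workflow.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
g1, g2 = rng.normal(50, 10, 40), rng.normal(55, 12, 40)

# Normality: Shapiro-Wilk per group (pair with Q-Q plots, not instead of them)
for name, g in [("group 1", g1), ("group 2", g2)]:
    w, p = stats.shapiro(g)
    print(f"{name}: Shapiro-Wilk W = {w:.3f}, p = {p:.3f}")

# Homogeneity of variance: Levene's test (robust to non-normality)
lev_stat, lev_p = stats.levene(g1, g2)
print(f"Levene: W = {lev_stat:.3f}, p = {lev_p:.3f}")

# Fallbacks when assumptions fail
t_welch = stats.ttest_ind(g1, g2, equal_var=False)  # unequal variance -> Welch
u = stats.mannwhitneyu(g1, g2)                      # non-normal -> Mann-Whitney U
```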
## Integration
Use with data-analyst agent for complete statistical analysis workflow and experiment-designer agent for planning appropriate analyses.
---
**Version:** 1.0.0

---
name: inclusion-criteria
description: "Apply inclusion/exclusion criteria systematically in literature reviews. Use when: (1) Screening abstracts, (2) Reviewing full texts, (3) Documenting screening decisions, (4) Ensuring PRISMA compliance."
allowed-tools: Read, Write
version: 1.0.0
---
# Inclusion Criteria Application Skill
## Purpose
Systematically apply eligibility criteria during literature screening.
## Criteria Components
**PICOS Framework:**
- **P**opulation: Who?
- **I**ntervention: What?
- **C**omparison: Compared to?
- **O**utcome: Measuring?
- **S**tudy design: How?
## Example Criteria
**Inclusion:**
- Adults (18+ years)
- Randomized controlled trials
- Published in English
- Depression as primary outcome
- Published 2010-2024
**Exclusion:**
- Non-human studies
- Case reports/case series
- No control group
- Depression as secondary outcome only
## Decision Making
**Title/Abstract Screening:**
- Liberal inclusion (when in doubt, include)
- Quick decisions
- High sensitivity
**Full-Text Review:**
- Strict application of criteria
- Document reasons for exclusion
- Record in PRISMA flow diagram
---
**Version:** 1.0.0

---
name: irb-protocol
description: "Develop IRB/ethics protocols for human subjects research. Use when: (1) Planning studies involving humans, (2) Preparing IRB applications, (3) Ensuring ethical compliance, (4) Addressing informed consent."
allowed-tools: Read, Write
version: 1.0.0
---
# IRB Protocol Development Skill
## Purpose
Create comprehensive ethical protocols for institutional review board approval.
## Key IRB Components
**1. Study Purpose**
- Research question
- Scientific justification
- Expected benefits
**2. Study Design**
- Methodology overview
- Sample size justification
- Duration and timeline
**3. Participant Selection**
- Inclusion/exclusion criteria
- Recruitment methods
- Vulnerable populations addressed
**4. Risks and Benefits**
- Potential risks identified
- Risk mitigation strategies
- Benefits to participants/society
- Risk-benefit analysis
**5. Informed Consent**
- Voluntary participation
- Right to withdraw
- Clear language (8th grade level)
- Comprehension assessment
**6. Privacy and Confidentiality**
- Data protection measures
- Anonymization/de-identification
- Data storage and security
- Data sharing plans
**7. Compensation**
- Payment amount and schedule
- Justified as not coercive
## Risk Categories
**Minimal Risk**: No greater than daily life
**More than Minimal**: Requires full board review
---
**Version:** 1.0.0

---
name: literature-gap
description: "Identify research gaps from systematic literature reviews. Use when: (1) Completing literature reviews, (2) Justifying new studies, (3) Grant proposal development, (4) Dissertation planning, (5) Identifying future research directions."
allowed-tools: Read, Grep
version: 1.0.0
---
# Literature Gap Identification Skill
## Purpose
Systematically identify and prioritize research gaps from literature synthesis.
## Types of Research Gaps
**1. Knowledge Gaps**
- Phenomenon not yet studied
- Understudied populations
- Unexplored contexts
**2. Methodological Gaps**
- Lack of rigorous designs (RCTs)
- Limited longitudinal studies
- Need for mixed methods
**3. Theoretical Gaps**
- Competing theories not tested
- Mechanisms not understood
- Mediators/moderators unexplored
**4. Practice Gaps**
- Interventions not tested
- Implementation not studied
- Scalability unknown
**5. Evidence Quality Gaps**
- High risk of bias in existing studies
- Small sample sizes
- Inconsistent results needing resolution
## Gap Analysis Process
1. **Synthesize Findings**: What do we know?
2. **Identify Limitations**: What are the weaknesses?
3. **Find Patterns**: What's consistently missing?
4. **Prioritize**: Which gaps are most important?
5. **Justify**: Why does this gap matter?
## Example
**Research Area:** Mindfulness interventions for anxiety
**Gaps Identified:**
1. Few studies in adolescent populations (knowledge gap)
2. Lack of active control comparisons (methodological gap)
3. Mechanisms of action unclear (theoretical gap)
4. No implementation studies in schools (practice gap)
5. High attrition rates not addressed (evidence quality gap)
**Prioritized Gap:** Adolescent populations with active controls
- Feasible, high impact, fills critical knowledge void
---
**Version:** 1.0.0

---
name: meta-analysis
description: "Conduct quantitative synthesis through meta-analysis. Use when: (1) Combining effect sizes across studies, (2) Systematic review synthesis, (3) Calculating summary effects, (4) Assessing heterogeneity."
allowed-tools: Read, Write, Bash
version: 1.0.0
---
# Meta-Analysis Skill
## Purpose
Quantitatively synthesize results across multiple studies.
## Meta-Analysis Steps
**1. Extract Effect Sizes**
- Convert to common metric (d, OR, RR)
- Calculate standard errors
**2. Choose Model**
- Fixed-effect: Assumes single true effect
- Random-effects: Allows heterogeneity
**3. Pool Results**
- Weight studies (inverse variance)
- Calculate summary effect
- 95% confidence interval
**4. Assess Heterogeneity**
- I² statistic (0-100%)
- 0-40%: Low heterogeneity
- 40-75%: Moderate
- 75-100%: High
- Q test (statistical significance)
**5. Investigate Heterogeneity**
- Subgroup analysis
- Meta-regression
- Sensitivity analysis
**6. Publication Bias**
- Funnel plot
- Egger's test
- Trim-and-fill
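Steps 1-4 can be sketched as a small function (DerSimonian-Laird random-effects pooling from effect sizes and their standard errors). It is written directly from the textbook formulas, so treat it as illustrative rather than a validated implementation; dedicated packages such as metafor (R) are the practical choice.

```python
import math

def random_effects_meta(effects, ses):
    """Inverse-variance pooling with DerSimonian-Laird tau2; returns (g, CI, I2)."""
    w = [1 / se ** 2 for se in ses]                        # fixed-effect weights
    fixed = sum(wi * e for wi, e in zip(w, effects)) / sum(w)
    q = sum(wi * (e - fixed) ** 2 for wi, e in zip(w, effects))  # Cochran's Q
    df = len(effects) - 1
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - df) / c)                          # between-study variance
    w_star = [1 / (se ** 2 + tau2) for se in ses]          # random-effects weights
    pooled = sum(wi * e for wi, e in zip(w_star, effects)) / sum(w_star)
    se_pooled = math.sqrt(1 / sum(w_star))
    i2 = 100 * max(0.0, (q - df) / q) if q > 0 else 0.0    # I2 heterogeneity (%)
    return pooled, (pooled - 1.96 * se_pooled, pooled + 1.96 * se_pooled), i2
```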
## Reporting
**Example:**
"Meta-analysis of 15 RCTs (N = 1,234) showed a moderate effect, g = 0.52, 95% CI [0.38, 0.66], p < .001. Heterogeneity was moderate, I² = 58%, suggesting variability in effects."
---
**Version:** 1.0.0

---
name: power-analysis
description: "Calculate statistical power and required sample sizes for research studies. Use when: (1) Designing experiments to determine sample size, (2) Justifying sample size for grant proposals or protocols, (3) Evaluating adequacy of existing studies, (4) Meeting NIH rigor standards for pre-registration, (5) Conducting retrospective power analysis to interpret null results."
allowed-tools: Read, Write
version: 1.0.0
---
# Statistical Power Analysis Skill
## Purpose
Calculate statistical power and determine required sample sizes for research studies. Essential for experimental design, grant writing, and meeting NIH rigor and reproducibility standards.
## Core Concepts
### Statistical Power
**Definition:** Probability of detecting a true effect when it exists (1 - β)
**Standard:** Power ≥ 0.80 (80%) is typically required for NIH grants and pre-registration
### Key Parameters
1. **Effect Size (d, r, η²)** - Magnitude of the phenomenon
2. **Alpha (α)** - Type I error rate (typically 0.05)
3. **Power (1-β)** - Probability of detecting effect (typically 0.80)
4. **Sample Size (N)** - Number of participants/observations needed
### The Relationship
```
Power = f(Effect Size, Sample Size, Alpha, Test Type)
For given effect size and alpha:
↑ Sample Size → ↑ Power
↑ Effect Size → ↓ Sample Size needed
```
## When to Use This Skill
### Pre-Study (Prospective Power Analysis)
1. **Grant Proposals** - Justify requested sample size
2. **Study Design** - Determine recruitment needs
3. **Pre-Registration** - Document planned sample size with justification
4. **Resource Planning** - Estimate time and cost requirements
5. **Ethical Review** - Minimize participants while maintaining power
### Post-Study (Retrospective/Sensitivity Analysis)
1. **Null Results** - Was study adequately powered?
2. **Publication** - Report achieved power
3. **Meta-Analysis** - Assess individual study adequacy
4. **Study Critique** - Evaluate power of published work
## Common Study Designs
### 1. Independent Samples T-Test
**Use:** Compare two independent groups
**Formula:**
```
N per group = 2 * (z_α/2 + z_β)² / d²
Where:
- d = effect size (Cohen's d = (M₁ - M₂) / SD_pooled; the variance is already absorbed into d)
- α = significance level (typ. 0.05)
- β = Type II error (1 - power)
```
**Example:**
```
Research Question: Does intervention improve test scores vs. control?
Effect Size: d = 0.5 (medium effect)
Alpha: 0.05
Power: 0.80
Result: N = 64 per group (128 total)
```
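The calculation behind this example, using the normal approximation (note that it returns 63 here; exact t-distribution routines such as G*Power round up to 64):

```python
import math
from scipy.stats import norm

def n_per_group(d, alpha=0.05, power=0.80):
    """Normal-approximation N per group, two-sided independent t-test."""
    z_a = norm.ppf(1 - alpha / 2)  # 1.96 for alpha = 0.05
    z_b = norm.ppf(power)          # 0.84 for power = 0.80
    return math.ceil(2 * (z_a + z_b) ** 2 / d ** 2)

print(n_per_group(0.5))  # 63; G*Power's exact t-based calculation gives 64
```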
**Effect Size Guidelines (Cohen's d):**
- Small: d = 0.2
- Medium: d = 0.5
- Large: d = 0.8
### 2. Paired Samples T-Test
**Use:** Pre-post comparisons, matched pairs
**Formula:**
```
N = (z_α/2 + z_β)² * 2(1-ρ) / d²
Where ρ = correlation between measures
```
**Example:**
```
Research Question: Does training improve performance (pre-post)?
Effect Size: d = 0.6
Correlation: ρ = 0.5 (moderate test-retest reliability)
Alpha: 0.05
Power: 0.80
Result: N = 24 participants
```
**Key Insight:** Higher correlation → fewer participants needed
### 3. One-Way ANOVA
**Use:** Compare 3+ independent groups
**Formula:**
```
N per group = λ / (k * f²)
Where:
- k = number of groups
- f = effect size (Cohen's f)
- λ = noncentrality parameter of the F-test (from software/tables;
  λ ≈ 10.9 for k = 4, α = 0.05, power = 0.80)
```
**Example:**
```
Research Question: Compare 4 treatment conditions
Effect Size: f = 0.25 (medium)
Alpha: 0.05
Power: 0.80
Result: N = 45 per group (180 total)
```
**Effect Size Guidelines (Cohen's f):**
- Small: f = 0.10
- Medium: f = 0.25
- Large: f = 0.40
### 4. Chi-Square Test
**Use:** Association between categorical variables
**Formula:**
```
N = λ / w²
Where:
- w = effect size (Cohen's w)
- λ = noncentrality parameter of the χ² test for the table's df
  (for df = 1 at α = 0.05, power = 0.80, λ = (z_α/2 + z_β)² ≈ 7.85)
```
**Example:**
```
Research Question: Is treatment success related to group (2x2 table)?
Effect Size: w = 0.3 (medium)
Alpha: 0.05
Power: 0.80
df = 1
Result: N = 88 total participants
```
**Effect Size Guidelines (Cohen's w):**
- Small: w = 0.10
- Medium: w = 0.30
- Large: w = 0.50
### 5. Correlation
**Use:** Relationship between continuous variables
**Formula:**
```
N = (z_α/2 + z_β)² / C(r)² + 3
Where C(r) = 0.5 * ln((1+r)/(1-r)) [Fisher's z]
```
**Example:**
```
Research Question: Correlation between anxiety and performance
Expected r: 0.30
Alpha: 0.05
Power: 0.80
Result: N = 84 participants
```
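The Fisher-z formula in code. It returns 85 for this example, while exact routines report 84; the small gap comes from the approximation, so a difference of one participant is expected.

```python
import math
from scipy.stats import norm

def n_for_correlation(r, alpha=0.05, power=0.80):
    """N to detect correlation r (two-sided) via Fisher's z transformation."""
    c = 0.5 * math.log((1 + r) / (1 - r))  # Fisher's z of the expected r
    z_a, z_b = norm.ppf(1 - alpha / 2), norm.ppf(power)
    return math.ceil(((z_a + z_b) / c) ** 2 + 3)
```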
**Effect Size Guidelines (Pearson's r):**
- Small: r = 0.10
- Medium: r = 0.30
- Large: r = 0.50
### 6. Multiple Regression
**Use:** Predict outcome from multiple predictors
**Formula:**
```
N = (L / f²) + k + 1
Where:
- L = non-centrality parameter
- f² = effect size (Cohen's f²)
- k = number of predictors
```
**Example:**
```
Research Question: Predict depression from 5 variables
Effect Size: f² = 0.15 (medium)
Alpha: 0.05
Power: 0.80
Predictors: 5
Result: N = 92 participants
```
**Effect Size Guidelines (f²):**
- Small: f² = 0.02
- Medium: f² = 0.15
- Large: f² = 0.35
## Choosing Effect Sizes
### Method 1: Previous Literature
**Best Practice:** Use meta-analytic estimates
```
Example: Meta-analysis shows d = 0.45 for CBT vs. control
Use: d = 0.45 for power calculation
```
### Method 2: Pilot Study
```
Run small pilot (N = 20-30)
Calculate observed effect size
Adjust for uncertainty (use 80% of observed)
```
### Method 3: Minimum Meaningful Effect
```
Question: What's the smallest effect worth detecting?
Clinical: Minimum clinically important difference (MCID)
Practical: Cost-benefit threshold
```
### Method 4: Cohen's Conventions
**Use only when no better information available:**
- Small effects: Often require N > 300
- Medium effects: Typically N = 50-100 per group
- Large effects: May need only N = 20-30 per group
**Warning:** Cohen's conventions are rough guidelines, not universal truths!
## NIH Rigor Standards
### Required Elements for NIH Grants
1. **Justified Effect Size**
- Cite source (literature, pilot, theory)
- Explain why this effect size is reasonable
- Consider range of plausible values
2. **Power Calculation**
- Show formula or software used
- Report all assumptions
- Calculate required N
3. **Sensitivity Analysis**
- Show power across range of effect sizes
- Demonstrate study is adequately powered
4. **Accounting for Attrition**
```
N_recruit = N_required / (1 - expected_attrition_rate)
Example:
N_required = 100
Expected attrition = 20%
N_recruit = 100 / 0.80 = 125
```
### Sample Power Analysis Section (Grant)
```
Sample Size Justification
Our target is 140 participants (70 per group) completing the study, sufficient to detect a medium
effect (d = 0.50) with 80% power at α = 0.05 using an independent
samples t-test. This effect size is based on our pilot study (N = 30)
which showed d = 0.62, and is consistent with meta-analytic estimates
for similar interventions (Smith et al., 2023; d = 0.48, 95% CI
[0.35, 0.61]).
We calculated the required sample size using G*Power 3.1, assuming
a two-tailed test. To account for anticipated 20% attrition based on
previous studies in this population, we will recruit 175 participants
(rounded to 180 for equal groups: 90 per condition).
Sensitivity analysis shows our design provides:
- 95% power to detect d = 0.60 (large effect)
- 80% power to detect d = 0.50 (medium effect)
- 55% power to detect d = 0.40 (small-medium effect)
This ensures adequate power while minimizing participant burden and
research costs.
```
## Tools and Software
### G*Power (Free)
- **Use:** Most common power analysis tool
- **Pros:** User-friendly, comprehensive test coverage
- **Cons:** Desktop only, manual calculations
- **Download:** https://www.psychologie.hhu.de/arbeitsgruppen/allgemeine-psychologie-und-arbeitspsychologie/gpower
### R (pwr package)
```R
library(pwr)
# Independent t-test
pwr.t.test(d = 0.5, power = 0.80, sig.level = 0.05, type = "two.sample")
# ANOVA
pwr.anova.test(k = 4, f = 0.25, power = 0.80, sig.level = 0.05)
# Correlation
pwr.r.test(r = 0.30, power = 0.80, sig.level = 0.05)
```
### Python (statsmodels)
```python
from statsmodels.stats.power import ttest_power
# Calculate power
power = ttest_power(effect_size=0.5, nobs=64, alpha=0.05)
# Calculate required N
from statsmodels.stats.power import tt_ind_solve_power
n = tt_ind_solve_power(effect_size=0.5, power=0.8, alpha=0.05)
```
### Online Calculators
- **PASS:** https://www.ncss.com/software/pass/
- **PS:** https://sph.umich.edu/biostat/power-and-sample-size.html
## Common Pitfalls
### Pitfall 1: Using Post-Hoc Power
❌ **Wrong:** Calculate power after null result using observed effect
**Problem:** Will always show low power for null results (circular reasoning)
✅ **Right:** Report sensitivity analysis (what effects were you powered to detect?)
### Pitfall 2: Underpowered Studies
❌ **Wrong:** "We had N=20, but it was a pilot study"
**Problem:** Even pilots should have defensible sample sizes
✅ **Right:** Use sequential design, specify stopping rules, or acknowledge limitation
### Pitfall 3: Overpowered Studies
❌ **Wrong:** N = 1000 to detect tiny effect (d = 0.1)
**Problem:** Wastes resources, detects trivial effects
✅ **Right:** Power for smallest meaningful effect, not just statistical significance
### Pitfall 4: Ignoring Multiple Comparisons
❌ **Wrong:** Power calculation for single test, but running 10 tests
**Problem:** Actual power much lower due to multiple testing correction
✅ **Right:** Adjust alpha or specify primary vs. secondary outcomes
### Pitfall 5: Wrong Test Type
❌ **Wrong:** Power for independent t-test, but actually paired
**Problem:** Wrong sample size (paired designs often need fewer)
✅ **Right:** Match power calculation to actual analysis plan
## Reporting Power Analysis
### In Pre-Registration
```
Sample Size Determination:
- Statistical test: Independent samples t-test (two-tailed)
- Effect size: d = 0.50 (based on [citation])
- Alpha: 0.05
- Power: 0.80
- Required N: 64 per group (128 total)
- Accounting for 15% attrition: 150 recruited (75 per group)
- Power analysis conducted using G*Power 3.1.9.7
```
### In Manuscript Methods
```
We determined our sample size a priori using power analysis. To detect
a medium effect (Cohen's d = 0.50) with 80% power at α = 0.05
(two-tailed), we required 64 participants per group (G*Power 3.1).
Accounting for anticipated 15% attrition, we recruited 150 participants.
```
### In Results (for Null Findings)
```
Although we found no significant effect (p = .23), sensitivity analysis
shows our study had 80% power to detect effects of d ≥ 0.50 and 55%
power for d = 0.40. Our 95% CI for the effect size was d = -0.12 to 0.38,
excluding large effects but not ruling out small-to-medium effects.
```
## Integration with Research Workflow
### With Experiment-Designer Agent
```
Agent uses power-analysis skill to:
1. Calculate required sample size
2. Justify N in protocol
3. Generate power analysis section for pre-registration
4. Create sensitivity analysis plots
```
### With NIH-Validator
```
Validator checks:
- Power ≥ 80%
- Effect size justified with citation
- Attrition accounted for
- Analysis plan matches power calculation
```
### With Statistical Analysis
```
After analysis:
1. Report achieved power or sensitivity analysis
2. Calculate confidence intervals around effect size
3. Interpret results in context of statistical power
```
## Examples
### Example 1: RCT Power Analysis
```
Design: Randomized controlled trial, two groups
Outcome: Depression scores (continuous)
Analysis: Independent samples t-test
Literature: Meta-analysis shows d = 0.55 for CBT vs. waitlist
Conservative estimate: d = 0.50
Alpha: 0.05 (two-tailed)
Power: 0.80
Calculation (G*Power):
Input: t-test, independent samples, d = 0.5, α = 0.05, power = 0.80
Output: N = 64 per group (128 total)
With 20% attrition: 160 recruited (80 per group)
```
### Example 2: Within-Subjects Design
```
Design: Pre-post intervention
Outcome: Anxiety scores
Analysis: Paired t-test
Pilot data: d = 0.70, r = 0.60
Alpha: 0.05
Power: 0.80
Calculation (considering correlation):
N = 19 participants
With 15% attrition: 23 recruited
```
### Example 3: Factorial ANOVA
```
Design: 2x3 factorial (Treatment x Time)
Analysis: Two-way ANOVA
Effect of interest: Interaction (Treatment × Time)
Effect size: f = 0.25 (medium)
Alpha: 0.05
Power: 0.80
Calculation:
N = 50 per cell (300 total for 6 cells)
With 10% attrition: 330 recruited (55 per cell)
```
---
**Last Updated:** 2025-11-09
**Version:** 1.0.0

---
name: pre-registration
description: "Create pre-registration documents for research transparency. Use when: (1) Before data collection, (2) Grant submissions, (3) Ensuring reproducibility, (4) Meeting open science standards, (5) Preventing HARKing."
allowed-tools: Read, Write
version: 1.0.0
---
# Pre-Registration Skill
## Purpose
Document study plans before data collection to ensure transparency and prevent questionable research practices.
## Pre-Registration Components
**1. Study Information**
- Title, authors, institutions
- Funding sources
**2. Hypotheses**
- Primary hypothesis (pre-specified)
- Secondary hypotheses
- Exploratory questions
**3. Design**
- Study type (observational, experimental)
- Blinding (who, how)
- Randomization details
**4. Sampling**
- Sample size (with power analysis)
- Recruitment strategy
- Inclusion/exclusion criteria
- Stopping rules
**5. Variables**
- Independent variables (manipulation)
- Dependent variables (outcomes)
- Control variables
- Measurement instruments
**6. Analysis Plan**
- Primary statistical test
- Handling missing data
- Outlier treatment
- Multiple comparison corrections
**7. Timeline**
- Start date
- Data collection period
- Analysis completion
## Platforms
- **OSF (Open Science Framework)**: osf.io
- **AsPredicted**: aspredicted.org
- **ClinicalTrials.gov**: For clinical trials
---
**Version:** 1.0.0

---
name: prisma-diagram
description: "Generate PRISMA 2020 flow diagrams for systematic reviews. Use when: (1) Conducting systematic literature reviews, (2) Documenting screening process, (3) Reporting study selection for publications, (4) Demonstrating PRISMA compliance, (5) Creating transparent review methodology documentation."
allowed-tools: Read, Write
version: 1.0.0
---
# PRISMA Flow Diagram Skill
## Purpose
Create PRISMA 2020-compliant flow diagrams showing the study selection process in systematic reviews.
## PRISMA 2020 Flow Diagram Structure
```
┌─────────────────────────────────────────┐
│ Identification │
├─────────────────────────────────────────┤
│ Records identified from: │
│ • Databases (n = X) │
│ • Registers (n = X) │
│ • Other sources (n = X) │
│ │
│ Records removed before screening: │
│ • Duplicate records (n = X) │
│ • Records marked ineligible (n = X) │
│ • Records removed for other reasons │
│ (n = X) │
└─────────────────────────────────────────┘
┌─────────────────────────────────────────┐
│ Screening │
├─────────────────────────────────────────┤
│ Records screened (n = X) │
│ Records excluded (n = X) │
└─────────────────────────────────────────┘
┌─────────────────────────────────────────┐
│ Reports sought for retrieval (n = X) │
│ Reports not retrieved (n = X) │
└─────────────────────────────────────────┘
┌─────────────────────────────────────────┐
│ Reports assessed for eligibility │
│ (n = X) │
│ │
│ Reports excluded: (n = X) │
│ • Reason 1 (n = X) │
│ • Reason 2 (n = X) │
│ • Reason 3 (n = X) │
└─────────────────────────────────────────┘
┌─────────────────────────────────────────┐
│ Included │
├─────────────────────────────────────────┤
│ Studies included in review (n = X) │
│ Reports of included studies (n = X) │
└─────────────────────────────────────────┘
```
## Required Data Points
1. **Identification:**
- Records from each database
- Duplicates removed
- Records marked ineligible
2. **Screening:**
- Total records screened
- Records excluded at title/abstract
3. **Eligibility:**
- Full-text articles assessed
- Exclusion reasons with counts
4. **Included:**
- Final number of studies
- Final number of reports
## Usage
Provide counts from your literature search and screening process. The skill generates a properly formatted PRISMA 2020 flow diagram in markdown or visual format.
## Integration
Use with literature-reviewer agent and research-database MCP server to automatically populate counts from screening data.
---
**Version:** 1.0.0

---
name: publication-prep
description: "Prepare manuscripts for journal submission. Use when: (1) Writing research papers, (2) Selecting target journals, (3) Formatting manuscripts, (4) Ensuring reporting guideline compliance, (5) Preparing submission materials."
allowed-tools: Read, Write, Grep
version: 1.0.0
---
# Publication Preparation Skill
## Purpose
Prepare complete, submission-ready manuscripts following journal requirements and reporting guidelines.
## Manuscript Components
**1. Title Page**
- Informative title (<20 words)
- Author names, affiliations
- Corresponding author contact
- Word count
- Funding, conflicts of interest
**2. Abstract**
- Structured (Background, Methods, Results, Conclusions)
- Within word limit (typically 250-300)
- Keywords (3-6 terms)
**3. Introduction**
- Background and rationale
- Literature gap
- Study objectives/hypotheses
**4. Methods**
- Study design
- Participants
- Procedures
- Measures
- Statistical analysis
**5. Results**
- Participant flow
- Descriptive statistics
- Primary outcomes
- Secondary outcomes
- Effect sizes with CI
**6. Discussion**
- Summary of findings
- Comparison to literature
- Limitations
- Implications
- Conclusions
**7. References**
- Formatted per journal style
- Complete and accurate
**8. Tables and Figures**
- Self-contained
- Publication quality
- Referred to in text
## Reporting Guidelines
- **RCTs**: CONSORT checklist
- **Systematic Reviews**: PRISMA checklist
- **Observational**: STROBE checklist
- **Diagnostic**: STARD checklist
## Journal Selection
**Consider:**
- Scope and fit
- Impact factor
- Open access vs subscription
- Review speed
- Acceptance rate
## Submission Checklist
- [ ] Manuscript formatted per journal guidelines
- [ ] Reporting checklist completed
- [ ] All co-authors approved
- [ ] Ethics approval documented
- [ ] Data availability statement
- [ ] Funding disclosed
- [ ] Conflicts of interest declared
- [ ] Cover letter written
- [ ] Suggested reviewers (if allowed)
- [ ] Supplementary materials prepared
---
**Version:** 1.0.0

---
name: randomization
description: "Implement proper randomization procedures for experiments. Use when: (1) Assigning participants to conditions, (2) Ensuring unbiased allocation, (3) Meeting CONSORT standards, (4) Pre-registration."
allowed-tools: Read, Write, Bash
version: 1.0.0
---
# Randomization Skill
## Purpose
Implement proper random assignment to minimize selection bias.
## Randomization Methods
**1. Simple Randomization**
- Coin flip, random number generator
- Best for large samples (N>200)
- Risk of imbalance in small samples
**2. Block Randomization**
- Ensures equal group sizes
- Blocks of 4, 6, or 8
- Example: AABB, ABAB, BABA, BBAA
**3. Stratified Randomization**
- Balance prognostic factors
- Stratify by sex, age group, severity
- Then randomize within strata
**4. Minimization**
- Dynamic allocation
- Minimizes imbalance across factors
- Used in small trials
## Implementation
**Steps:**
1. Generate random sequence (with seed)
2. Document sequence generation
3. Implement allocation concealment
4. Execute randomization
5. Document actual allocation
**Example (Python):**
```python
import random

random.seed(12345)  # Document the seed for reproducibility
sequence = ['A', 'B'] * 50  # 1:1 allocation for 100 participants
random.shuffle(sequence)  # Simple randomization of the full list
```
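The block method described above can be sketched the same way. A minimal illustration, not a production allocation system; the seed and block size are arbitrary choices for the example:

```python
import random

def block_randomize(n_participants, block_size=4, seed=12345):
    """Block randomization: shuffle within fixed-size blocks so group
    sizes stay balanced throughout enrollment (1:1 allocation)."""
    assert block_size % 2 == 0, "even block size needed for 1:1 allocation"
    rng = random.Random(seed)  # seeded generator, documented for audit
    sequence = []
    while len(sequence) < n_participants:
        block = ['A', 'B'] * (block_size // 2)
        rng.shuffle(block)  # order varies, but each block stays balanced
        sequence.extend(block)
    return sequence[:n_participants]

assignments = block_randomize(100)
```

Because every block contains equal numbers of A and B, the group sizes never differ by more than half a block at any point during enrollment.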
---
**Version:** 1.0.0

---
name: research-questions
description: "Formulate research questions using FINER criteria (Feasible, Interesting, Novel, Ethical, Relevant). Use when: (1) Starting new research projects, (2) Refining study scope, (3) Grant proposal development, (4) Ensuring research question quality."
allowed-tools: Read, Write
version: 1.0.0
---
# Research Question Formulation Skill
## Purpose
Develop high-quality research questions using FINER criteria.
## FINER Criteria
**F - Feasible**
- Adequate participants available
- Sufficient technical expertise
- Affordable in time and money
- Manageable in scope
**I - Interesting**
- Intriguing to investigator, peers, and the community
- Answer worth obtaining regardless of the result
**N - Novel**
- Confirms/refutes previous findings
- Extends previous findings
- Provides new findings
**E - Ethical**
- IRB approvable
- Benefits outweigh risks
- Equipoise exists
**R - Relevant**
- Advances scientific knowledge
- Influences clinical practice or policy
- Informs future research
## Examples
**Poor Question:**
"Does stress affect health?"
- Too broad, not specific
**Better Question:**
"Does chronic work-related stress increase cardiovascular disease risk in middle-aged adults?"
- Specific population, exposure, outcome
**Best Question:**
"Among adults aged 40-60 with high job strain (Karasek Job Content Questionnaire score >24), does a 12-week mindfulness-based stress reduction intervention reduce systolic blood pressure compared to usual care?"
- PICO(T) format: Population, Intervention, Comparison, Outcome, Time
- Measurable variables
- Specific timeframe
---
**Version:** 1.0.0

---
name: results-interpretation
description: "Interpret statistical results correctly and comprehensively. Use when: (1) Writing results sections, (2) Discussing findings, (3) Avoiding common misinterpretations, (4) Reporting effect sizes and confidence intervals."
allowed-tools: Read, Write
version: 1.0.0
---
# Results Interpretation Skill
## Purpose
Correctly interpret and report statistical findings with appropriate nuance.
## Key Principles
**1. Effect Size > p-value**
- Report effect sizes with 95% CI
- Statistical significance ≠ practical importance
**2. Confidence Intervals**
- Range of plausible values
- Precision of estimate
- If a CI for a difference includes 0, the result is not significant at the matching alpha level
**3. P-values**
- Probability of data at least this extreme, assuming H0 is true
- NOT: Probability H0 is true
- NOT: Probability of replication
**4. Multiple Comparisons**
- Adjust alpha if running many tests
- Distinguish primary vs exploratory
## Correct Reporting
**Example:**
"The intervention group showed higher scores (M=43.8, SD=8.3) than control (M=37.8, SD=9.1), t(98)=3.45, p<.001, d=0.69, 95% CI [0.29, 1.09]. This represents a medium-to-large effect."
**Include:**
- Descriptive statistics
- Test statistic and df
- P-value
- Effect size with CI
- Interpretation
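The reporting elements above can be computed from group summary statistics. A minimal sketch with made-up numbers; the CI uses one common large-sample approximation for the standard error of d (other variants exist):

```python
import math

def cohens_d(m1, sd1, n1, m2, sd2, n2):
    """Cohen's d from summary statistics, with an approximate 95% CI."""
    # Pooled standard deviation
    sp = math.sqrt(((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / (n1 + n2 - 2))
    d = (m1 - m2) / sp
    # Large-sample SE of d (one common approximation)
    se = math.sqrt((n1 + n2) / (n1 * n2) + d**2 / (2 * (n1 + n2 - 2)))
    return d, (d - 1.96 * se, d + 1.96 * se)

# Hypothetical group summaries, for illustration only
d, ci = cohens_d(12.4, 3.1, 40, 10.8, 3.4, 40)
```

Reporting the interval alongside d makes the precision of the estimate explicit rather than leaving only a point value and a p-value.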
---
**Version:** 1.0.0

---
name: risk-of-bias
description: "Assess risk of bias in research studies for systematic reviews. Use when: (1) Conducting systematic reviews, (2) Evaluating study quality, (3) GRADE assessments, (4) Meta-analysis planning."
allowed-tools: Read, Write
version: 1.0.0
---
# Risk of Bias Assessment Skill
## Purpose
Systematically evaluate methodological quality and potential biases in research studies.
## Tools by Study Design
**RCTs**: Cochrane Risk of Bias 2 (RoB 2)
**Non-randomized**: ROBINS-I
**Diagnostic Studies**: QUADAS-2
**Prognostic Studies**: QUIPS
## RoB 2 Domains
1. **Randomization Process**
- Adequate sequence generation?
- Allocation concealment?
2. **Deviations from Interventions**
- Blinding effective?
- Adherence to protocol?
3. **Missing Outcome Data**
- Complete data?
- Appropriate handling?
4. **Outcome Measurement**
- Blinded assessment?
- Validated measures?
5. **Selective Reporting**
- All outcomes reported?
- Pre-registered?
## Risk Levels
- **Low**: Minimal bias
- **Some concerns**: Potential bias
- **High**: Serious bias likely
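Turning the five domain judgments into an overall rating can be sketched as below. This is a deliberate simplification of the published RoB 2 algorithm, which can also escalate several "Some concerns" domains to an overall "High":

```python
def overall_rob(domains):
    """Simplified overall RoB 2 judgment from per-domain ratings."""
    ratings = list(domains.values())
    if "High" in ratings:
        return "High"  # any high-risk domain dominates the judgment
    if all(r == "Low" for r in ratings):
        return "Low"   # low overall only if every domain is low
    return "Some concerns"

rating = overall_rob({"randomization": "Low",
                      "deviations": "Some concerns",
                      "missing_data": "Low",
                      "measurement": "Low",
                      "reporting": "Low"})
```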
## Integration
Use with literature-reviewer agent for systematic quality assessment of included studies.
---
**Version:** 1.0.0

---
name: sensitivity-analysis
description: "Conduct sensitivity analyses to test robustness of findings. Use when: (1) Testing assumption violations, (2) Meta-analysis robustness, (3) Handling missing data, (4) Examining outliers."
allowed-tools: Read, Write, Bash
version: 1.0.0
---
# Sensitivity Analysis Skill
## Purpose
Test whether findings are robust to analytical decisions and assumptions.
## Types of Sensitivity Analyses
**1. Exclusion Analyses**
- Remove outliers
- Remove high risk-of-bias studies
- One-study-removed analysis
**2. Analytical Decisions**
- Different statistical tests
- Parametric vs non-parametric
- Different transformations
**3. Missing Data**
- Complete case analysis
- Best-case scenario
- Worst-case scenario
- Multiple imputation
**4. Measurement**
- Different outcome definitions
- Different time points
- Alternative scoring methods
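The one-study-removed analysis listed under exclusion analyses can be sketched as follows. For brevity the pooled effect is an unweighted mean; a real meta-analysis would use inverse-variance weights, and the study names here are hypothetical:

```python
def leave_one_out(effects):
    """Recompute the pooled effect after dropping each study in turn."""
    results = {}
    for name in effects:
        rest = [d for k, d in effects.items() if k != name]
        results[name] = sum(rest) / len(rest)  # unweighted mean of the rest
    return results

effects = {"Smith 2023": 0.65, "Jones 2022": 0.42, "Lee 2021": 0.58}
pooled_without = leave_one_out(effects)
```

If no single removal moves the pooled estimate appreciably, the finding is robust to any individual study.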
## Interpretation
**Robust Findings:**
- Results consistent across analyses
- Conclusions unchanged
- High confidence
**Sensitive Findings:**
- Results vary by decision
- Interpret with caution
- Report uncertainty
## Example
"Results were robust to removal of the highest risk-of-bias study (d=0.48 vs d=0.52) and remained significant when using non-parametric tests (p=.002)."
---
**Version:** 1.0.0

---
name: subgroup-analysis
description: "Conduct subgroup analyses to examine effect moderation. Use when: (1) Testing pre-specified moderators, (2) Exploring heterogeneity, (3) Identifying differential effects, (4) Meta-analysis synthesis."
allowed-tools: Read, Write, Bash
version: 1.0.0
---
# Subgroup Analysis Skill
## Purpose
Examine whether effects differ across subgroups of participants or studies.
## Planning Subgroup Analyses
**Pre-Specify:**
- Which subgroups (age, sex, severity)
- Rationale for each
- Statistical approach
**Limit Number:**
- Too many = Type I error inflation
- Focus on theoretically important
- Correct for multiple comparisons
## Subgroup Analysis Methods
**1. Interaction Tests**
- Test group × subgroup interaction
- More powerful than separate analyses
**2. Stratified Analysis**
- Separate analysis per subgroup
- Compare effect sizes
**3. Meta-Regression**
- Continuous moderators
- Multiple moderators simultaneously
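For two meta-analytic subgroups, the interaction test in method 1 can be approximated by a z-test on the difference between subgroup effects. A sketch with hypothetical effect sizes and standard errors:

```python
import math

def subgroup_difference_test(d1, se1, d2, se2):
    """Z-test for the difference between two independent subgroup effects."""
    z = (d1 - d2) / math.sqrt(se1**2 + se2**2)
    # Two-sided p-value from the standard normal CDF
    p = 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))
    return z, p

# Hypothetical subgroup effects (e.g., younger vs older participants)
z, p = subgroup_difference_test(0.68, 0.12, 0.34, 0.15)
```

Testing the difference directly is more powerful than noting that one subgroup's effect is significant while the other's is not.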
## Interpretation Cautions
**Observational Nature:**
- Subgroups not randomized
- Confounding possible
- Generate hypotheses, don't confirm
**Multiple Testing:**
- Adjust alpha (Bonferroni)
- Or report as exploratory
**Example:**
"Pre-specified subgroup analysis showed larger effects in younger participants (<50 years, d=0.68) versus older (≥50 years, d=0.34), interaction p=.02."
---
**Version:** 1.0.0

---
name: synthesis-matrix
description: "Create evidence synthesis matrices for systematic reviews. Use when: (1) Organizing extracted data, (2) Comparing study characteristics, (3) Identifying patterns across studies, (4) Preparing synthesis for manuscripts."
allowed-tools: Read, Write
version: 1.0.0
---
# Evidence Synthesis Matrix Skill
## Purpose
Organize and synthesize evidence across multiple studies using structured matrices.
## Matrix Structure
| Study | Population | Design | Intervention | Outcome | Effect Size | Quality |
|-------|-----------|---------|--------------|---------|------------|---------|
| Smith 2023 | N=120, Adults | RCT | CBT vs WL | Depression | d=0.65 | Low RoB |
| Jones 2022 | N=85, Adolescents | RCT | CBT vs TAU | Depression | d=0.42 | Some concerns |
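Matrices like the one above can be rendered programmatically from extracted records. A minimal sketch whose column names mirror the table; the function is illustrative rather than a fixed API, and the record shown is the example row:

```python
def synthesis_matrix(studies):
    """Render a list of study dicts as a markdown synthesis matrix."""
    cols = ["Study", "Population", "Design", "Intervention",
            "Outcome", "Effect Size", "Quality"]
    lines = ["| " + " | ".join(cols) + " |",
             "|" + "|".join("---" for _ in cols) + "|"]
    for s in studies:
        lines.append("| " + " | ".join(s[c] for c in cols) + " |")
    return "\n".join(lines)

matrix = synthesis_matrix([{
    "Study": "Smith 2023", "Population": "N=120, Adults", "Design": "RCT",
    "Intervention": "CBT vs WL", "Outcome": "Depression",
    "Effect Size": "d=0.65", "Quality": "Low RoB",
}])
```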
## Key Elements
**Study Characteristics:**
- Author, year
- Sample size
- Population details
**Methods:**
- Study design
- Intervention details
- Comparison group
- Follow-up duration
**Results:**
- Primary outcomes
- Effect sizes with CI
- Statistical significance
**Quality:**
- Risk of bias assessment
- GRADE rating
- Limitations
## Integration with Agents
Use with literature-reviewer agent to automatically populate matrices from extracted data.
---
**Version:** 1.0.0