Literature Search Strategies

Effective Techniques for Finding Scientific Evidence

Comprehensive literature search is essential for grounding hypotheses in existing evidence. This reference provides strategies for both PubMed (biomedical literature) and general scientific search.

Search Strategy Framework

Three-Phase Approach

  1. Broad exploration: Understand the landscape and identify key concepts
  2. Focused searching: Target specific mechanisms, theories, or findings
  3. Citation mining: Follow references and related articles from key papers

Clarify search goals:

  • What aspects of the phenomenon need evidence?
  • What types of studies are most relevant (reviews, primary research, methods)?
  • What time frame is relevant (recent only, or historical context)?
  • What level of evidence is needed (mechanistic, correlational, causal)?

PubMed Search Strategies

When to Use PubMed

Use WebFetch with PubMed URLs for:

  • Biomedical and life sciences research
  • Clinical studies and medical literature
  • Molecular, cellular, and physiological mechanisms
  • Disease etiology and pathology
  • Drug and therapeutic research

Effective PubMed Search Techniques

1. Start with Review Articles

Why: Reviews synthesize literature, identify key concepts, and provide comprehensive reference lists.

Search strategy:

  • Add "review" to search terms
  • Use PubMed filters: Article Type → Review, Systematic Review, Meta-Analysis
  • Look for recent reviews (last 2-5 years)

Example searches:

  • https://pubmed.ncbi.nlm.nih.gov/?term=wound+healing+diabetes+review
  • https://pubmed.ncbi.nlm.nih.gov/?term=gut+microbiome+cognition+systematic+review

2. Use MeSH Terms (Medical Subject Headings)

Why: MeSH terms are standardized vocabulary that captures concept variations.

Strategy:

  • PubMed automatically maps many untagged search terms to matching MeSH terms
  • MeSH terms help find papers that use different terminology for the same concept
  • Searches that use MeSH terms are more comprehensive than keyword-only searches

Example:

  • Instead of just "heart attack," use MeSH term "Myocardial Infarction"
  • Captures papers using "MI," "heart attack," "cardiac infarction," etc.
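A minimal sketch of the MeSH approach, assuming PubMed's `[MeSH Terms]` and `[tiab]` (title/abstract) field tags. The helper name `mesh_query` is hypothetical. Pairing the heading with free-text synonyms is one common way to also catch records that have not yet been indexed with MeSH.

```python
from urllib.parse import quote_plus


def mesh_query(heading: str, *free_text: str) -> str:
    """Combine a MeSH heading with free-text synonyms using OR."""
    parts = [f'"{heading}"[MeSH Terms]']
    # [tiab] restricts each synonym to the title and abstract fields.
    parts += [f'"{term}"[tiab]' for term in free_text]
    return " OR ".join(parts)


query = mesh_query("Myocardial Infarction", "heart attack")
# quote_plus percent-encodes quotes and brackets and turns spaces into '+'.
url = "https://pubmed.ncbi.nlm.nih.gov/?term=" + quote_plus(query)
print(query)
print(url)
```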

3. Boolean Operators and Advanced Syntax

AND: Narrow search (all terms must be present)

  • diabetes AND wound healing AND inflammation

OR: Broaden search (any term can be present)

  • (Alzheimer OR dementia) AND gut microbiome

NOT: Exclude terms

  • cancer treatment NOT surgery

Quotes: Exact phrases

  • "oxidative stress"

Wildcards: Variations

  • gene* finds gene, genes, genetic, genetics (and also unrelated words such as general or generation, so check the results)
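The operators above compose mechanically, so a query can be assembled from small helpers. This is an illustrative sketch only; the function names `phrase`, `any_of`, `all_of`, and `excluding` are hypothetical and not part of any PubMed tooling.

```python
def phrase(term: str) -> str:
    """Wrap multi-word terms in double quotes so they match as exact phrases."""
    return f'"{term}"' if " " in term else term


def any_of(*terms: str) -> str:
    """OR: broaden the search; parentheses keep the group together."""
    return "(" + " OR ".join(phrase(t) for t in terms) + ")"


def all_of(*parts: str) -> str:
    """AND: narrow the search; every part must be present."""
    return " AND ".join(parts)


def excluding(query: str, term: str) -> str:
    """NOT: drop results that mention the excluded term."""
    return f"{query} NOT {phrase(term)}"


query = all_of(any_of("Alzheimer", "dementia"), phrase("gut microbiome"))
print(query)  # (Alzheimer OR dementia) AND "gut microbiome"
```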

4. Filter by Publication Type and Date

Publication types:

  • Clinical Trial
  • Meta-Analysis
  • Systematic Review
  • Research Support, N.I.H., Extramural (or Intramural)
  • Randomized Controlled Trial

Date filters:

  • Recent work (last 2-5 years): Cutting-edge findings
  • Historical work: Foundational studies
  • Specific time periods: Track development of understanding
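Besides the sidebar filters, the same restrictions can be written into the query itself. The sketch below assumes PubMed's `[pt]` (publication type) and `[dp]` (date of publication) field tags with the `start:end` range syntax. The helper `with_filters` and its parameters are hypothetical.

```python
from datetime import date
from typing import Optional


def with_filters(query: str,
                 pub_type: Optional[str] = None,
                 last_n_years: Optional[int] = None,
                 today: Optional[date] = None) -> str:
    """Append publication-type and date-range restrictions to a query."""
    parts = [f"({query})"]
    if pub_type is not None:
        parts.append(f'"{pub_type}"[pt]')
    if last_n_years is not None:
        # `today` can be passed explicitly to make the result reproducible.
        end_year = (today or date.today()).year
        parts.append(f"{end_year - last_n_years}:{end_year}[dp]")
    return " AND ".join(parts)


print(with_filters("diabetes AND wound healing", "Systematic Review", 5))
```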

5. Use "Similar Articles" and "Cited By"

Strategy:

  • Find one highly relevant paper
  • Click "Similar articles" for related work
  • Use "Cited by" to find newer work that builds on it

PubMed Search Examples by Hypothesis Goal

Mechanistic understanding:

https://pubmed.ncbi.nlm.nih.gov/?term=(mechanism+OR+pathway)+AND+[phenomenon]+AND+(molecular+OR+cellular)

Causal relationships:

https://pubmed.ncbi.nlm.nih.gov/?term=[exposure]+AND+[outcome]+AND+(randomized+controlled+trial+OR+cohort+study)

Biomarkers and associations:

https://pubmed.ncbi.nlm.nih.gov/?term=[biomarker]+AND+[disease]+AND+(association+OR+correlation+OR+prediction)

Treatment effectiveness:

https://pubmed.ncbi.nlm.nih.gov/?term=[intervention]+AND+[condition]+AND+(efficacy+OR+effectiveness+OR+clinical+trial)
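The bracketed placeholders in the four templates above can be filled programmatically. This sketch rewrites them as Python `str.format` slots and URL-encodes the result with `urllib.parse.quote_plus`, so parentheses and spaces in the query are safe to pass in a URL. The dictionary keys and the `goal_url` helper are hypothetical names.

```python
from urllib.parse import quote_plus

TEMPLATES = {
    "mechanism": "(mechanism OR pathway) AND {phenomenon} AND (molecular OR cellular)",
    "causal": "{exposure} AND {outcome} AND (randomized controlled trial OR cohort study)",
    "biomarker": "{biomarker} AND {disease} AND (association OR correlation OR prediction)",
    "treatment": "{intervention} AND {condition} AND (efficacy OR effectiveness OR clinical trial)",
}


def goal_url(goal: str, **slots: str) -> str:
    """Fill the template for a hypothesis goal and return a PubMed search URL."""
    query = TEMPLATES[goal].format(**slots)
    return "https://pubmed.ncbi.nlm.nih.gov/?term=" + quote_plus(query)


print(goal_url("causal", exposure="vitamin D", outcome="cardiovascular disease"))
```

A missing slot raises `KeyError`, which is a useful early signal that the search goal was not fully specified.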

General Scientific Web Search Strategies

Use WebSearch for:

  • Non-biomedical sciences (physics, chemistry, materials, earth sciences)
  • Interdisciplinary topics
  • Recent preprints and unpublished work
  • Grey literature (technical reports, conference proceedings)
  • Broader context and cross-domain analogies

Effective Web Search Techniques

1. Use Domain-Specific Search Terms

Include field-specific terminology:

  • Chemistry: "mechanism," "reaction pathway," "synthesis"
  • Physics: "model," "theory," "experimental validation"
  • Materials science: "properties," "characterization," "synthesis"
  • Ecology: "population dynamics," "community structure"

2. Target Academic Sources

Search operators:

  • site:arxiv.org - Preprints (physics, CS, math, quantitative biology)
  • site:biorxiv.org - Biology preprints
  • site:edu - Academic institutions
  • filetype:pdf - PDF documents, which are often academic papers

Example searches:

  • superconductivity high temperature mechanism site:arxiv.org
  • CRISPR off-target effects site:biorxiv.org
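When the same topic should be checked against several sources, it can help to generate one query per site so the results stay separable. A small sketch, with `site_queries` as a hypothetical helper name:

```python
def site_queries(topic: str, sites: list[str]) -> list[str]:
    """Return one search string per site, using the site: operator."""
    return [f"{topic} site:{site}" for site in sites]


for q in site_queries("CRISPR off-target effects", ["biorxiv.org", "arxiv.org"]):
    print(q)
```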

3. Search for Authors and Labs

When you find a relevant paper:

  • Search for the authors' other work
  • Find their lab website for unpublished work
  • Identify key research groups in the field

4. Use Google Scholar Approaches

Strategies:

  • Use "Cited by" to find newer related work
  • Use "Related articles" to expand search
  • Set date ranges to focus on recent work
  • Use author: operator to find specific researchers

5. Combine General and Specific Terms

Structure:

  • Specific phenomenon + general concept
  • "tomato plant growth" + "bacterial promotion"
  • "cognitive decline" + "gut microbiome"

Boolean logic:

  • Use quotes for exact phrases: "spike protein mutation"
  • Use OR for alternatives: (transmissibility OR transmission rate)
  • Combine: "spike protein" AND (transmissibility OR virulence) AND mutation

Cross-Database Search Strategies

Comprehensive Literature Search Workflow

  1. Start with reviews (PubMed or Web Search):

    • Identify key concepts and terminology
    • Note influential papers and researchers
    • Understand current state of field
  2. Focused primary research (PubMed):

    • Search for specific mechanisms
    • Find experimental evidence
    • Identify methodologies
  3. Broaden with web search:

    • Find related work in other fields
    • Locate recent preprints
    • Identify analogous systems
  4. Citation mining:

    • Follow references from key papers
    • Use "cited by" to find recent work
    • Track influential studies
  5. Iterative refinement:

    • Add new terms discovered in papers
    • Narrow if too many results
    • Broaden if too few relevant results
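The refinement step in the workflow above can be sketched as a simple rule on the result count: add a term with AND when there are too many hits, expand concepts with OR synonyms when there are too few. The thresholds (10 and 500) and the function names are assumptions chosen for illustration, not recommended values.

```python
from typing import Optional


def build(concepts: list[str], synonyms: Optional[dict[str, list[str]]] = None) -> str:
    """AND the concepts together, expanding each with OR synonyms if given."""
    synonyms = synonyms or {}
    groups = []
    for concept in concepts:
        alternatives = [concept] + synonyms.get(concept, [])
        if len(alternatives) > 1:
            groups.append("(" + " OR ".join(alternatives) + ")")
        else:
            groups.append(concept)
    return " AND ".join(groups)


def refine(concepts: list[str], n_results: int,
           synonyms: dict[str, list[str]], narrowing_term: str,
           too_few: int = 10, too_many: int = 500) -> str:
    """Return the next query to try, given the size of the last result set."""
    if n_results > too_many:
        return build(concepts + [narrowing_term])  # narrow
    if n_results < too_few:
        return build(concepts, synonyms)           # broaden
    return build(concepts)                         # keep as is


print(refine(["dementia", "gut microbiome"], 3,
             {"dementia": ["Alzheimer"]}, "inflammation"))
```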

Topic-Specific Search Strategies

Mechanisms and Pathways

Goal: Understand how something works

Search components:

  • Phenomenon + "mechanism"
  • Phenomenon + "pathway"
  • Phenomenon + specific molecules/pathways suspected

Examples:

  • diabetic wound healing mechanism inflammation
  • autophagy pathway cancer

Associations and Correlations

Goal: Find what factors are related

Search components:

  • Variable A + Variable B + "association"
  • Variable A + Variable B + "correlation"
  • Variable A + "predicts" + Variable B

Examples:

  • vitamin D cardiovascular disease association
  • gut microbiome diversity predicts cognitive function

Interventions and Treatments

Goal: Find evidence for what works

Search components:

  • Intervention + condition + "efficacy"
  • Intervention + condition + "randomized controlled trial"
  • Intervention + condition + "treatment outcome"

Examples:

  • probiotic intervention depression randomized controlled trial
  • exercise intervention cognitive decline efficacy

Methods and Techniques

Goal: Find out how to test a hypothesis

Search components:

  • Method name + application area
  • "How to measure" + phenomenon
  • Technique + validation

Examples:

  • CRISPR screen cancer drug resistance
  • measure protein-protein interaction methods

Analogous Systems

Goal: Find insights from related phenomena

Search components:

  • Mechanism + different system
  • Similar phenomenon + different organism/condition

Examples:

  • If studying plant-microbe symbiosis: search nitrogen fixation rhizobia legumes
  • If studying drug resistance: search antibiotic resistance evolution mechanisms

Evaluating Source Quality

Primary Research Quality Indicators

Strong quality signals:

  • Published in reputable journals
  • Large sample sizes (for statistical power)
  • Pre-registered studies (reduces bias)
  • Appropriate controls and methods
  • Consistent with other findings
  • Transparent data and methods

Red flags:

  • No peer review (use cautiously)
  • Conflicts of interest not disclosed
  • Methods not clearly described
  • Extraordinary claims without extraordinary evidence
  • Contradicts large body of evidence without explanation

Review Quality Indicators

Systematic reviews (highest quality):

  • Pre-defined search strategy
  • Explicit inclusion/exclusion criteria
  • Quality assessment of included studies
  • Quantitative synthesis (meta-analysis) where the included studies allow it

Narrative reviews (variable quality):

  • Expert synthesis of field
  • May have selection bias
  • Useful for context and framing
  • Check author expertise and citations

Allocate Search Time Appropriately

For straightforward hypotheses (30-60 min):

  • 1-2 broad review articles
  • 3-5 targeted primary research papers
  • Quick web search for recent developments

For complex hypotheses (1-3 hours):

  • Multiple reviews for different aspects
  • 10-15 primary research papers
  • Systematic search across databases
  • Citation mining from key papers

For contentious topics (3+ hours):

  • Systematic review approach
  • Identify competing perspectives
  • Track historical development
  • Cross-reference findings

Diminishing Returns

Signs you've searched enough:

  • Finding the same papers repeatedly
  • New searches yield mostly irrelevant papers
  • Sufficient evidence to support/contextualize hypotheses
  • Multiple independent lines of evidence converge

When to search more:

  • Major gaps in understanding remain
  • Conflicting evidence needs resolution
  • Hypothesis seems inconsistent with literature
  • Need specific methodological information

Documenting Search Results

Information to Capture

For each relevant paper:

  • Full citation (authors, year, journal, title)
  • Key findings relevant to hypothesis
  • Study design and methods
  • Limitations noted by authors
  • How it relates to hypothesis
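One way to keep these fields consistent across papers is a small record type. The sketch below mirrors the list above; the class name, field names, and all example values are placeholders, not a real reference.

```python
from dataclasses import dataclass


@dataclass
class PaperNote:
    authors: str
    year: int
    journal: str
    title: str
    key_findings: str
    design: str             # study design and methods
    limitations: str = ""   # limitations noted by the authors
    relation: str = ""      # how it relates to the hypothesis

    def citation(self) -> str:
        """Format the full citation as a single line."""
        return f"{self.authors} ({self.year}). {self.title}. {self.journal}."


note = PaperNote(
    authors="Author A, Author B",
    year=2024,
    journal="Example Journal",
    title="Placeholder title",
    key_findings="Placeholder summary of the finding",
    design="Placeholder study design",
    relation="supports hypothesis A",
)
print(note.citation())
```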

Organizing Findings

Group by:

  • Supporting evidence for hypothesis A, B, C
  • Methodological approaches
  • Conflicting findings requiring explanation
  • Gaps in current knowledge

Synthesis notes:

  • What is well-established?
  • What is controversial or uncertain?
  • What analogies exist in other systems?
  • What methods are commonly used?

Practical Search Workflow

Step-by-Step Process

  1. Define search goals (5 min):

    • What aspects of phenomenon need evidence?
    • What would support or refute hypotheses?
  2. Broad review search (15-20 min):

    • Find 1-3 review articles
    • Skim abstracts for relevance
    • Note key concepts and terminology
  3. Targeted primary research (30-45 min):

    • Search for specific mechanisms/evidence
    • Read abstracts, scan figures and conclusions
    • Follow most promising references
  4. Cross-domain search (15-30 min):

    • Look for analogies in other systems
    • Find recent preprints
    • Identify emerging trends
  5. Citation mining (15-30 min):

    • Follow references from key papers
    • Use "cited by" for recent work
    • Identify seminal studies
  6. Synthesize findings (20-30 min):

    • Summarize evidence for each hypothesis
    • Note patterns and contradictions
    • Identify knowledge gaps
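Adding up the per-step estimates gives the total budget for one pass through this process. The step names and minute ranges below are taken from the list above.

```python
# (step, minimum minutes, maximum minutes)
STEPS = [
    ("Define search goals", 5, 5),
    ("Broad review search", 15, 20),
    ("Targeted primary research", 30, 45),
    ("Cross-domain search", 15, 30),
    ("Citation mining", 15, 30),
    ("Synthesize findings", 20, 30),
]

low = sum(minimum for _, minimum, _ in STEPS)
high = sum(maximum for _, _, maximum in STEPS)
print(f"{low}-{high} min")  # 100-160 min
```

One full pass therefore takes roughly 1 hour 40 minutes to 2 hours 40 minutes, which falls inside the 1-3 hour range given for complex hypotheses.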

Iteration and Refinement

When initial search is insufficient:

  • Broaden terms if too few results
  • Add specific mechanisms/pathways if too many results
  • Try alternative terminology
  • Search for related phenomena
  • Consult review articles for better search terms

Red flags requiring more search:

  • Only finding weak or indirect evidence
  • All evidence comes from single lab or source
  • Evidence seems inconsistent with basic principles
  • Major aspects of phenomenon lack any relevant literature
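The single-source red flag can be checked quickly once each paper is tagged with its originating lab or group. A sketch with placeholder labels; `largest_source_share` is a hypothetical helper, and what counts as "too concentrated" is left to judgment.

```python
from collections import Counter


def largest_source_share(sources: list[str]) -> float:
    """Share of papers contributed by the most frequent lab or source."""
    if not sources:
        return 0.0
    counts = Counter(sources)
    return max(counts.values()) / len(sources)


labs = ["Lab A", "Lab A", "Lab A", "Lab B"]  # placeholder labels
print(largest_source_share(labs))  # 0.75
```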

Common Search Pitfalls

Pitfalls to Avoid

  1. Confirmation bias: Only seeking evidence supporting preferred hypothesis

    • Solution: Actively search for contradicting evidence
  2. Recency bias: Only considering recent work, missing foundational studies

    • Solution: Include historical searches, track development of ideas
  3. Too narrow: Missing relevant work due to restrictive terms

    • Solution: Use OR operators, try alternative terminology
  4. Too broad: Overwhelmed by irrelevant results

    • Solution: Add specific terms, use filters, combine concepts with AND
  5. Single database: Missing important work in other fields

    • Solution: Search both PubMed and general web, try domain-specific databases
  6. Stopping too soon: Insufficient evidence to ground hypotheses

    • Solution: Set minimum targets (e.g., 2 reviews + 5 primary papers per hypothesis aspect)
  7. Cherry-picking: Citing only supportive papers

    • Solution: Represent full spectrum of evidence, acknowledge contradictions
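The minimum targets suggested for pitfall 6 (2 reviews and 5 primary papers per hypothesis aspect) can be checked mechanically. In this sketch each collected paper is a hypothetical `(aspect, kind)` pair, and `unmet_targets` reports how many papers are still missing per aspect.

```python
from collections import Counter

MIN_TARGETS = {"review": 2, "primary": 5}


def unmet_targets(papers: list[tuple[str, str]],
                  targets: dict[str, int] = MIN_TARGETS) -> dict[str, dict[str, int]]:
    """Return {aspect: {kind: shortfall}} for aspects below the targets."""
    counts = Counter(papers)
    gaps = {}
    for aspect in sorted({aspect for aspect, _ in papers}):
        short = {
            kind: needed - counts[(aspect, kind)]
            for kind, needed in targets.items()
            if counts[(aspect, kind)] < needed
        }
        if short:
            gaps[aspect] = short
    return gaps


papers = (
    [("mechanism", "review")] * 2 + [("mechanism", "primary")] * 5
    + [("biomarker", "review")] + [("biomarker", "primary")] * 3
)
print(unmet_targets(papers))  # {'biomarker': {'review': 1, 'primary': 2}}
```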

Special Cases

Emerging Topics (Limited Literature)

When little published work exists:

  • Search for analogous phenomena in related systems
  • Look for preprints (arXiv, bioRxiv)
  • Find conference abstracts and posters
  • Identify theoretical frameworks that may apply
  • Note the limited evidence in hypothesis generation

Controversial Topics (Conflicting Literature)

When evidence is contradictory:

  • Systematically document both sides
  • Look for methodological differences explaining conflict
  • Check for temporal trends (has understanding shifted?)
  • Identify what would resolve the controversy
  • Generate hypotheses explaining the discrepancy

Interdisciplinary Topics

When spanning multiple fields:

  • Search each field's primary databases
  • Use field-specific terminology for each domain
  • Look for bridging papers that cite across fields
  • Consider consulting domain experts
  • Translate concepts between disciplines carefully

Integration with Hypothesis Generation

Using Literature to Inform Hypotheses

Direct applications:

  • Established mechanisms to apply to new contexts
  • Known pathways relevant to phenomenon
  • Similar phenomena in related systems
  • Validated methods for testing

Indirect applications:

  • Analogies from different systems
  • Theoretical frameworks to apply
  • Gaps suggesting novel mechanisms
  • Contradictions requiring resolution

Balancing Literature Dependence

Too literature-dependent:

  • Hypotheses merely restate known mechanisms
  • No novel insights or predictions
  • "Hypotheses" are actually established facts

Too literature-independent:

  • Hypotheses ignore relevant evidence
  • Propose implausible mechanisms
  • Reinvent already-tested ideas
  • Inconsistent with established principles

Optimal balance:

  • Grounded in existing evidence
  • Extend understanding in novel ways
  • Acknowledge both supporting and challenging evidence
  • Generate testable predictions beyond current knowledge