# Evidence Synthesis and Guideline Integration Guide ## Overview Evidence synthesis involves systematically reviewing, analyzing, and integrating research findings to inform clinical recommendations. This guide covers guideline sources, evidence hierarchies, systematic reviews, meta-analyses, and integration of multiple evidence streams for clinical decision support. ## Major Clinical Practice Guidelines ### Oncology Guidelines **NCCN (National Comprehensive Cancer Network)** - **Scope**: 60+ cancer types, supportive care guidelines - **Update Frequency**: Continuous (online), 1-3 updates per year per guideline - **Evidence Categories**: - **Category 1**: High-level evidence, uniform NCCN consensus - **Category 2A**: Lower-level evidence, uniform consensus (appropriate) - **Category 2B**: Lower-level evidence, non-uniform consensus (appropriate) - **Category 3**: Major disagreement or insufficient evidence - **Access**: Free for patients, subscription for providers (institutional access common) - **Application**: US-focused, most widely used in clinical practice **ASCO (American Society of Clinical Oncology)** - **Scope**: Evidence-based clinical practice guidelines - **Methodology**: Systematic review, GRADE-style evidence tables - **Endorsements**: Often endorses NCCN, ESMO, or other guidelines - **Focused Topics**: Specific clinical questions (e.g., biomarker testing, supportive care) - **Guideline Products**: Full guidelines, rapid recommendations, endorsements - **Quality**: Rigorous methodology, peer-reviewed publication **ESMO (European Society for Medical Oncology)** - **Scope**: European guidelines for cancer management - **Evidence Levels**: - **I**: Evidence from at least one large RCT or meta-analysis - **II**: Evidence from at least one well-designed non-randomized trial, cohort study - **III**: Evidence from well-designed non-experimental study - **IV**: Evidence from expert committee reports or opinions - **V**: Evidence from case series, case reports - **Recommendation Grades**: - **A**: Strong evidence for efficacy, substantial clinical benefit (strongly recommended) - **B**: Strong or moderate evidence, limited clinical benefit (generally recommended) - **C**: Insufficient evidence, benefit not sufficiently well established - **D**: Moderate evidence against efficacy or for adverse effects (not recommended) - **E**: Strong evidence against efficacy (never recommended) - **ESMO-MCBS**: Magnitude of Clinical Benefit Scale (grades 1-5 for meaningful benefit) ### Cardiovascular Guidelines **AHA/ACC (American Heart Association / American College of Cardiology)** - **Scope**: Cardiovascular disease prevention, diagnosis, management - **Class of Recommendation (COR)**: - **Class I**: Strong recommendation - should be performed/administered - **Class IIa**: Moderate recommendation - is reasonable - **Class IIb**: Weak recommendation - may be considered - **Class III - No Benefit**: Not recommended - **Class III - Harm**: Potentially harmful - **Level of Evidence (LOE)**: - **A**: High-quality evidence from >1 RCT, meta-analyses - **B-R**: Moderate-quality evidence from ≥1 RCT - **B-NR**: Moderate-quality evidence from non-randomized studies - **C-LD**: Limited data from observational studies, registries - **C-EO**: Expert opinion based on clinical experience - **Example**: "Statin therapy is recommended for adults with LDL-C ≥190 mg/dL (Class I, LOE A)" **ESC (European Society of Cardiology)** - **Scope**: European cardiovascular guidelines - **Class of Recommendation**: - **I**: Recommended or indicated - **II**: Should be considered - **III**: Not recommended - **Level of Evidence**: A (RCTs), B (single RCT or observational), C (expert opinion) ### Other Specialties **IDSA (Infectious Diseases Society of America)** - Antimicrobial guidelines, infection management - GRADE methodology - Strong vs weak recommendations **ATS/ERS (American Thoracic Society / European Respiratory Society)** - Respiratory disease management - GRADE methodology **ACR (American College of Rheumatology)** - Rheumatic disease guidelines - Conditionally recommended vs strongly recommended **KDIGO (Kidney Disease: Improving Global Outcomes)** - Chronic kidney disease, dialysis, transplant - GRADE-based recommendations ## GRADE Methodology ### Assessing Quality of Evidence **Initial Quality Assignment** **Randomized Controlled Trials**: Start at HIGH quality (⊕⊕⊕⊕) **Observational Studies**: Start at LOW quality (⊕⊕○○) ### Factors Decreasing Quality (Downgrade) **Risk of Bias** (-1 or -2 levels) - Lack of allocation concealment - Lack of blinding - Incomplete outcome data - Selective outcome reporting - Other sources of bias **Inconsistency** (-1 or -2 levels) - Unexplained heterogeneity in results across studies - Wide variation in effect estimates - Non-overlapping confidence intervals - High I² statistic in meta-analysis (>50-75%) **Indirectness** (-1 or -2 levels) - Different population than target (younger patients in trials, applying to elderly) - Different intervention (higher dose in trial than used in practice) - Different comparator (placebo in trial, comparing to active treatment) - Surrogate outcomes (PFS) when interested in survival (OS) **Imprecision** (-1 or -2 levels) - Wide confidence intervals crossing threshold of benefit/harm - Small sample size, few events - Optimal information size (OIS) not met - Rule of thumb: <300 events for continuous outcomes, <200 events for dichotomous **Publication Bias** (-1 level) - Funnel plot asymmetry (if ≥10 studies) - Known unpublished studies with negative results - Selective outcome reporting - Industry-sponsored studies only ### Factors Increasing Quality (Upgrade - Observational Only) **Large Magnitude of Effect** (+1 or +2 levels) - +1: RR >2 or <0.5 (moderate effect) - +2: RR >5 or <0.2 (large effect) - No plausible confounders would reduce effect **Dose-Response Gradient** (+1 level) - Clear dose-response or duration-response relationship - Strengthens causal inference **All Plausible Confounders Would Reduce Effect** (+1 level) - Observed effect despite confounders biasing toward null - Rare, requires careful justification ### Final Quality Rating After adjustments, assign final quality: - **High (⊕⊕⊕⊕)**: Very confident in effect estimate - **Moderate (⊕⊕⊕○)**: Moderately confident; true effect likely close to estimate - **Low (⊕⊕○○)**: Limited confidence; true effect may be substantially different - **Very Low (⊕○○○)**: Very little confidence; true effect likely substantially different ## Systematic Reviews and Meta-Analyses ### PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analyses) **Search Strategy** - **Databases**: PubMed/MEDLINE, Embase, Cochrane Library, Web of Science - **Search Terms**: PICO (Population, Intervention, Comparator, Outcome) - **Date Range**: Typically last 10-20 years or comprehensive - **Language**: English only or all languages with translation - **Grey Literature**: Conference abstracts, trial registries, unpublished data **Study Selection** ``` PRISMA Flow Diagram: Records identified through database searching (n=2,450) Additional records through other sources (n=15) ↓ Records after duplicates removed (n=1,823) ↓ Records screened (title/abstract) (n=1,823) → Excluded (n=1,652) ↓ - Not relevant topic (n=1,120) Full-text articles assessed (n=171) - Animal studies (n=332) ↓ - Reviews (n=200) Studies included in qualitative synthesis (n=38) → Excluded (n=133) ↓ - Wrong population (n=42) Studies included in meta-analysis (n=24) - Wrong intervention (n=35) - No outcomes reported (n=28) - Duplicate data (n=18) - Poor quality (n=10) ``` **Data Extraction** - Study characteristics: Design, sample size, population, intervention - Results: Outcomes, effect sizes, confidence intervals, p-values - Quality assessment: Risk of bias tool (Cochrane RoB 2.0 for RCTs) - Dual extraction: Two reviewers independently, resolve disagreements ### Meta-Analysis Methods **Fixed-Effect Model** - **Assumption**: Single true effect size shared by all studies - **Weighting**: By inverse variance (larger studies have more weight) - **Application**: When heterogeneity is low (I² <25%) - **Interpretation**: Estimate of common effect across studies **Random-Effects Model** - **Assumption**: True effect varies across studies (distribution of effects) - **Weighting**: By inverse variance + between-study variance - **Application**: When heterogeneity moderate to high (I² ≥25%) - **Interpretation**: Estimate of average effect (center of distribution) - **Wider CI**: Accounts for heterogeneity, more conservative **Heterogeneity Assessment** **I² Statistic** - Percentage of variability due to heterogeneity rather than chance - I² = 0-25%: Low heterogeneity - I² = 25-50%: Moderate heterogeneity - I² = 50-75%: Substantial heterogeneity - I² = 75-100%: Considerable heterogeneity **Q Test (Cochran's Q)** - Test for heterogeneity - p<0.10 suggests significant heterogeneity (liberal threshold) - Low power when few studies, use I² as primary measure **Tau² (τ²)** - Estimate of between-study variance - Used in random-effects weighting **Subgroup Analysis** - Explore sources of heterogeneity - Pre-specified subgroups: Disease stage, biomarker status, treatment regimen - Test for interaction between subgroups **Forest Plot Interpretation** ``` Study n HR (95% CI) Weight ───────────────────────────────────────────────────────────── Trial A 2018 450 0.62 (0.45-0.85) ●───┤ 28% Trial B 2019 320 0.71 (0.49-1.02) ●────┤ 22% Trial C 2020 580 0.55 (0.41-0.74) ●──┤ 32% Trial D 2021 210 0.88 (0.56-1.38) ●──────┤ 18% Overall (RE model) 1560 0.65 (0.53-0.80) ◆──┤ Heterogeneity: I²=42%, p=0.16 0.25 0.5 1.0 2.0 4.0 Favors Treatment Favors Control ``` ## Guideline Integration ### Concordance Checking **Multi-Guideline Comparison** ``` Recommendation: First-line treatment for advanced NSCLC, PD-L1 ≥50% Guideline Version Recommendation Strength ───────────────────────────────────────────────────────────────────────────── NCCN v4.2024 Pembrolizumab monotherapy (preferred) Category 1 ESMO 2023 Pembrolizumab monotherapy (preferred) I, A ASCO 2022 Endorses NCCN guidelines Strong NICE (UK) 2023 Pembrolizumab approved Recommended Synthesis: Strong consensus across guidelines for pembrolizumab monotherapy. Alternative: Pembrolizumab + chemotherapy also Category 1/I-A recommended. ``` **Discordance Resolution** - Identify differences and reasons (geography, cost, access, evidence interpretation) - Note date of each guideline (newer may incorporate recent trials) - Consider regional applicability - Favor guidelines with most rigorous methodology (GRADE-based) ### Regulatory Approval Landscape **FDA Approvals** - Track indication-specific approvals - Accelerated approval vs full approval - Post-marketing requirements - Contraindications and warnings **EMA (European Medicines Agency)** - May differ from FDA in approved indications - Conditional marketing authorization - Additional monitoring (black triangle) **Regional Variations** - Health Technology Assessment (HTA) agencies - NICE (UK): Cost-effectiveness analysis, QALY thresholds - CADTH (Canada): Therapeutic review and recommendations - PBAC (Australia): Reimbursement decisions ## Real-World Evidence (RWE) ### Sources of RWE **Electronic Health Records (EHR)** - Clinical data from routine practice - Large patient numbers - Heterogeneous populations (more generalizable than RCTs) - Limitations: Missing data, inconsistent documentation, selection bias **Claims Databases** - Administrative claims for billing/reimbursement - Large scale (millions of patients) - Outcomes: Mortality, hospitalizations, procedures - Limitations: Lack clinical detail (labs, imaging, biomarkers) **Cancer Registries** - **SEER (Surveillance, Epidemiology, and End Results)**: US cancer registry - **NCDB (National Cancer Database)**: Hospital registry data - Population-level survival, treatment patterns - Limited treatment detail, no toxicity data **Prospective Cohorts** - Framingham Heart Study, Nurses' Health Study - Long-term follow-up, rich covariate data - Expensive, time-consuming ### RWE Applications **Comparative Effectiveness** - Compare treatments in real-world settings (less strict eligibility than RCTs) - Complement RCT data with broader populations - Example: Effectiveness of immunotherapy in elderly, poor PS patients excluded from trials **Safety Signal Detection** - Rare adverse events not detected in trials - Long-term toxicities - Drug-drug interactions in polypharmacy - Postmarketing surveillance **Treatment Patterns and Access** - Guideline adherence in community practice - Time to treatment initiation - Disparities in care delivery - Off-label use prevalence **Limitations of RWE** - **Confounding by indication**: Sicker patients receive more aggressive treatment - **Immortal time bias**: Time between events affecting survival estimates - **Missing data**: Incomplete or inconsistent data collection - **Causality**: Association does not prove causation without randomization **Strengthening RWE** - **Propensity score matching**: Balance baseline characteristics between groups - **Multivariable adjustment**: Adjust for measured confounders in Cox model - **Sensitivity analyses**: Test robustness to unmeasured confounding - **Instrumental variables**: Use natural experiments to approximate randomization ## Meta-Analysis Techniques ### Binary Outcomes (Response Rate, Event Rate) **Effect Measures** - **Risk Ratio (RR)**: Ratio of event probabilities - **Odds Ratio (OR)**: Ratio of odds (less intuitive) - **Risk Difference (RD)**: Absolute difference in event rates **Example Calculation** ``` Study 1: - Treatment A: 30/100 responded (30%) - Treatment B: 15/100 responded (15%) - RR = 0.30/0.15 = 2.0 (95% CI 1.15-3.48) - RD = 0.30 - 0.15 = 0.15 or 15% (95% CI 4.2%-25.8%) - NNT = 1/RD = 1/0.15 = 6.7 (treat 7 patients to get 1 additional response) ``` **Pooling Methods** - **Mantel-Haenszel**: Common fixed-effect method - **DerSimonian-Laird**: Random-effects method - **Peto**: For rare events (event rate <1%) ### Time-to-Event Outcomes (Survival, PFS) **Hazard Ratio Pooling** - Extract HR and 95% CI (or log(HR) and SE) from each study - Weight by inverse variance - Pool using generic inverse variance method - Report pooled HR with 95% CI, heterogeneity statistics **When HR Not Reported** - Extract from Kaplan-Meier curves (Parmar method, digitizing software) - Calculate from log-rank p-value and event counts - Request from study authors ### Continuous Outcomes (Quality of Life, Lab Values) **Standardized Mean Difference (SMD)** - Application: Different scales used across studies - SMD = (Mean₁ - Mean₂) / Pooled SD - Interpretation: Cohen's d effect size (0.2 small, 0.5 medium, 0.8 large) **Mean Difference (MD)** - Application: Same scale/unit used across studies - MD = Mean₁ - Mean₂ - More directly interpretable than SMD ## Network Meta-Analysis ### Purpose Compare multiple treatments simultaneously when no head-to-head trials exist **Example Scenario** - Drug A vs placebo (Trial 1) - Drug B vs placebo (Trial 2) - Drug C vs Drug A (Trial 3) - **Question**: How does Drug B compare to Drug C? (no direct comparison) ### Methods **Fixed-Effect Network Meta-Analysis** - Assumes consistency (transitivity): A vs B effect = (A vs C effect) - (B vs C effect) - Provides indirect comparison estimates - Ranks treatments by P-score or SUCRA **Random-Effects Network Meta-Analysis** - Allows heterogeneity between studies - More conservative estimates **Consistency Checking** - Compare direct vs indirect evidence for same comparison - Node-splitting analysis - Loop consistency (if closed loops in network) ### Interpretation Cautions - **Transitivity assumption**: May not hold if studies differ in important ways - **Indirect evidence**: Less reliable than direct head-to-head trials - **Rankings**: Probabilistic, not definitive ordering - **Clinical judgment**: Consider beyond statistical rankings ## Evidence Tables ### Constructing Evidence Summary Tables **PICO Framework** - **P (Population)**: Patient characteristics, disease stage, biomarker status - **I (Intervention)**: Treatment regimen, dose, schedule - **C (Comparator)**: Control arm (placebo, standard of care) - **O (Outcomes)**: Primary and secondary endpoints **Evidence Table Template** ``` Study Design n Population Intervention vs Comparator Outcome Result Quality ──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────── Smith 2020 RCT 450 Advanced NSCLC Drug A 10mg vs Median PFS 12 vs 6 months High EGFR+ standard chemo (95% CI) (10-14 vs 5-7) ⊕⊕⊕⊕ HR (95% CI) 0.48 (0.36-0.64) p-value p<0.001 ORR 65% vs 35% Grade 3-4 AEs 42% vs 38% Jones 2021 RCT 380 Advanced NSCLC Drug A 10mg vs Median PFS 10 vs 5.5 months High EGFR+ placebo HR (95% CI) 0.42 (0.30-0.58) ⊕⊕⊕⊕ p-value p<0.001 Pooled Effect Pooled HR 0.45 (0.36-0.57) High (Meta-analysis) I² 12% (low heterogeneity) ⊕⊕⊕⊕ ``` ### Evidence to Decision Framework **Benefits and Harms** - Magnitude of desirable effects (ORR, PFS, OS improvement) - Magnitude of undesirable effects (toxicity, quality of life impact) - Balance of benefits and harms - Net benefit calculation **Values and Preferences** - How do patients value outcomes? (survival vs quality of life) - Variability in patient values - Shared decision-making importance **Resource Considerations** - Cost of intervention - Cost-effectiveness ($/QALY) - Budget impact - Equity and access **Feasibility and Acceptability** - Is treatment available in practice settings? - Route of administration feasible? (oral vs IV vs subcutaneous) - Monitoring requirements realistic? - Patient and provider acceptability ## Guideline Concordance Documentation ### Synthesizing Multiple Guidelines **Concordant Recommendations** ``` Clinical Question: Treatment for HER2+ metastatic breast cancer, first-line Guideline Summary: ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ NCCN v3.2024 (Category 1): Preferred: Pertuzumab + trastuzumab + taxane Alternative: T-DM1, other HER2-targeted combinations ESMO 2022 (Grade I, A): Preferred: Pertuzumab + trastuzumab + docetaxel Alternative: Trastuzumab + chemotherapy (if pertuzumab unavailable) ASCO 2020 Endorsement: Endorses NCCN guidelines, recommends pertuzumab-based first-line Synthesis: Strong consensus for pertuzumab + trastuzumab + taxane as first-line standard. Evidence: CLEOPATRA trial (Swain 2015): median OS 56.5 vs 40.8 months (HR 0.68, p<0.001) Recommendation: Pertuzumab 840 mg IV loading then 420 mg + trastuzumab 8 mg/kg loading then 6 mg/kg + docetaxel 75 mg/m² every 3 weeks until progression. Strength: Strong (GRADE 1A) Evidence: High-quality, multiple RCTs, guideline concordance ``` **Discordant Recommendations** ``` Clinical Question: Adjuvant osimertinib for resected EGFR+ NSCLC NCCN v4.2024 (Category 1): Osimertinib 80 mg daily × 3 years after adjuvant chemotherapy Evidence: ADAURA trial (median DFS not reached vs 28 months, HR 0.17) ESMO 2023 (II, B): Osimertinib may be considered Note: Cost-effectiveness concerns, OS data immature NICE (UK) 2022: Not recommended for routine use Reason: QALY analysis unfavorable at current pricing Synthesis: Efficacy demonstrated in phase 3 trial (ADAURA), FDA/EMA approved. Guideline discordance based on cost-effectiveness, not clinical efficacy. US practice: NCCN Category 1, widely adopted European/UK: Variable adoption based on national HTA decisions Recommendation Context-Dependent: US: Strong recommendation if accessible (GRADE 1B) Countries with cost constraints: Conditional recommendation (GRADE 2B) ``` ## Quality Assessment Tools ### RCT Quality Assessment (Cochrane Risk of Bias 2.0) **Domains** 1. **Bias from randomization process**: Sequence generation, allocation concealment 2. **Bias from deviations from intended interventions**: Blinding, protocol adherence 3. **Bias from missing outcome data**: Attrition, intention-to-treat analysis 4. **Bias in outcome measurement**: Blinded assessment, objective outcomes 5. **Bias in selection of reported result**: Selective reporting, outcome switching **Judgment**: Low risk, some concerns, high risk (for each domain) **Overall Risk of Bias**: Based on highest-risk domain ### Observational Study Quality (Newcastle-Ottawa Scale) **Selection (max 4 stars)** - Representativeness of exposed cohort - Selection of non-exposed cohort - Ascertainment of exposure - Outcome not present at start **Comparability (max 2 stars)** - Comparability of cohorts (design/analysis adjustment for confounders) **Outcome (max 3 stars)** - Assessment of outcome - Follow-up duration adequate - Adequacy of follow-up (low attrition) **Total Score**: 0-9 stars - **High quality**: 7-9 stars - **Moderate quality**: 4-6 stars - **Low quality**: 0-3 stars ## Translating Evidence to Recommendations ### Recommendation Development Process **Step 1: PICO Question Formulation** ``` Example PICO: P - Population: Adults with type 2 diabetes and cardiovascular disease I - Intervention: SGLT2 inhibitor (empagliflozin) C - Comparator: Placebo (added to standard care) O - Outcomes: Major adverse cardiovascular events (3P-MACE), hospitalization for heart failure ``` **Step 2: Systematic Evidence Review** - Identify all relevant studies - Assess quality using standardized tools - Extract outcome data - Synthesize findings (narrative or meta-analysis) **Step 3: GRADE Evidence Rating** - Start at high (RCTs) or low (observational) - Downgrade for risk of bias, inconsistency, indirectness, imprecision, publication bias - Upgrade for large effect, dose-response, confounders reducing effect (observational only) - Assign final quality rating **Step 4: Recommendation Strength Determination** **Strong Recommendation (Grade 1)** - Desirable effects clearly outweigh undesirable effects - High or moderate quality evidence - Little variability in patient values - Intervention cost-effective **Conditional Recommendation (Grade 2)** - Trade-offs: Desirable and undesirable effects closely balanced - Low or very low quality evidence - Substantial variability in patient values/preferences - Uncertain cost-effectiveness **Step 5: Wording the Recommendation** ``` Strong: "We recommend..." Example: "We recommend SGLT2 inhibitor therapy for adults with type 2 diabetes and established cardiovascular disease to reduce risk of hospitalization for heart failure and cardiovascular death (Strong recommendation, high-quality evidence - GRADE 1A)." Conditional: "We suggest..." Example: "We suggest considering GLP-1 receptor agonist therapy for adults with type 2 diabetes and CKD to reduce risk of kidney disease progression (Conditional recommendation, moderate-quality evidence - GRADE 2B)." ``` ## Incorporating Emerging Evidence ### Early-Phase Trial Data **Phase 1 Trials** - Purpose: Dose-finding, safety - Outcomes: Maximum tolerated dose (MTD), dose-limiting toxicities (DLTs), pharmacokinetics - Evidence level: Very low (expert opinion, case series) - Clinical application: Investigational only, clinical trial enrollment **Phase 2 Trials** - Purpose: Preliminary efficacy signal - Design: Single-arm (ORR primary endpoint) or randomized (PFS comparison) - Evidence level: Low to moderate - Clinical application: May support off-label use in refractory settings, clinical trial enrollment preferred **Phase 3 Trials** - Purpose: Confirmatory efficacy and safety - Design: Randomized controlled trial, OS or PFS primary endpoint - Evidence level: High (if well-designed and executed) - Clinical application: Regulatory approval basis, guideline recommendations **Phase 4 Trials** - Purpose: Post-marketing surveillance, additional indications - Evidence level: Variable (depends on design) - Clinical application: Safety monitoring, expanded usage ### Breakthrough Therapy Designation **FDA Fast-Track Programs** - **Breakthrough Therapy**: Preliminary evidence of substantial improvement over existing therapy - **Accelerated Approval**: Approval based on surrogate endpoint (PFS, ORR) - Post-marketing requirement: Confirmatory OS trial - **Priority Review**: Shortened FDA review time (6 vs 10 months) **Implications for Guidelines** - May receive NCCN Category 2A before phase 3 data mature - Upgrade to Category 1 when confirmatory data published - Monitor for post-market confirmatory trial results ### Updating Recommendations **Triggers for Update** - New phase 3 trial results (major journal publication) - FDA/EMA approval for new indication or agent - Guideline update from NCCN, ASCO, ESMO - Safety alert or drug withdrawal - Meta-analysis changing effect estimates **Rapid Update Process** - Critical appraisal of new evidence - Assess impact on current recommendations - Revise evidence grade and recommendation strength if needed - Disseminate update to users - Version control and change log ## Conflicts of Interest and Bias ### Identifying Potential Bias **Study Sponsorship** - **Industry-sponsored**: May favor sponsor's product (publication bias, outcome selection) - **Academic**: May favor investigator's hypothesis - **Independent**: Government funding (NIH, PCORI) **Author Conflicts of Interest** - Consulting fees, research funding, stock ownership - Disclosure statements required by journals - ICMJE Form for Disclosure of Potential COI **Mitigating Bias** - Register trials prospectively (ClinicalTrials.gov) - Pre-specify primary endpoint and analysis plan - Independent data monitoring committee (IDMC) - Blinding of outcome assessors - Intention-to-treat analysis ### Transparency in Evidence Synthesis **Pre-Registration** - PROSPERO for systematic reviews - Pre-specify PICO, search strategy, outcomes, analysis plan - Prevents post-hoc changes to avoid negative findings **Reporting Checklists** - PRISMA for systematic reviews/meta-analyses - CONSORT for RCTs - STROBE for observational studies **Data Availability** - Individual patient data (IPD) sharing increases transparency - Repositories: ClinicalTrials.gov results database, journal supplements ## Practical Application ### Evidence Summary for Clinical Document ``` EVIDENCE SYNTHESIS: Osimertinib for EGFR-Mutated NSCLC Clinical Question: Should adults with treatment-naïve advanced NSCLC harboring EGFR exon 19 deletion or L858R mutation receive osimertinib versus first-generation EGFR TKIs? Evidence Review: ┌──────────────────────────────────────────────────────────────────────┐ │ FLAURA Trial (Soria et al., NEJM 2018) │ ├──────────────────────────────────────────────────────────────────────┤ │ Design: Phase 3 RCT, double-blind, 1:1 randomization │ │ Population: EGFR exon 19 del or L858R, stage IIIB/IV, ECOG 0-1 │ │ Sample Size: n=556 (279 osimertinib, 277 comparator) │ │ Intervention: Osimertinib 80 mg PO daily │ │ Comparator: Gefitinib 250 mg or erlotinib 150 mg PO daily │ │ Primary Endpoint: PFS by investigator assessment │ │ Secondary: OS, ORR, DOR, CNS progression, safety │ │ │ │ Results: │ │ - Median PFS: 18.9 vs 10.2 months (HR 0.46, 95% CI 0.37-0.57, p<0.001)│ │ - Median OS: 38.6 vs 31.8 months (HR 0.80, 95% CI 0.64-1.00, p=0.046)│ │ - ORR: 80% vs 76% (p=0.24) │ │ - Grade ≥3 AEs: 34% vs 45% │ │ - Quality: High (well-designed RCT, low risk of bias) │ └──────────────────────────────────────────────────────────────────────┘ Guideline Recommendations: NCCN v4.2024: Category 1 preferred ESMO 2022: Grade I, A ASCO 2022: Endorsed GRADE Assessment: Quality of Evidence: ⊕⊕⊕⊕ HIGH - Randomized controlled trial - Low risk of bias (allocation concealment, blinding, ITT analysis) - Consistent results (single large trial, consistent with phase 2 data) - Direct evidence (target population and outcomes) - Precise estimate (narrow CI, sufficient events) - No publication bias concerns Balance of Benefits and Harms: - Large PFS benefit (8.7 month improvement, HR 0.46) - OS benefit (6.8 month improvement, HR 0.80) - Similar ORR, improved tolerability (lower grade 3-4 AEs) - Desirable effects clearly outweigh undesirable effects Patient Values: Little variability (most patients value survival extension) Cost: Higher cost than first-gen TKIs, but widely accessible in developed countries FINAL RECOMMENDATION: Osimertinib 80 mg PO daily is recommended as first-line therapy for adults with advanced NSCLC harboring EGFR exon 19 deletion or L858R mutation. Strength: STRONG (Grade 1) Quality of Evidence: HIGH (⊕⊕⊕⊕) GRADE: 1A ``` ## Keeping Current ### Literature Surveillance **Automated Alerts** - PubMed My NCBI (save searches, email alerts) - Google Scholar alerts for specific topics - Journal table of contents alerts (NEJM, Lancet, JCO) - Guideline update notifications (NCCN, ASCO, ESMO email lists) **Conference Monitoring** - ASCO Annual Meeting (June) - ESMO Congress (September) - ASH Annual Meeting (December, hematology) - AHA Scientific Sessions (November, cardiology) - Plenary and press releases for practice-changing trials **Trial Results Databases** - ClinicalTrials.gov results database - FDA approval letters and reviews - EMA European public assessment reports (EPARs) ### Critical Appraisal Workflow **Weekly Review** 1. Screen new publications (title/abstract) 2. Full-text review of relevant studies 3. Quality assessment using checklists 4. Extract key findings 5. Assess impact on current recommendations **Monthly Synthesis** 1. Review accumulated evidence 2. Identify practice-changing findings 3. Update evidence tables 4. Revise recommendations if warranted 5. Disseminate updates to clinical teams **Annual Comprehensive Review** 1. Systematic review of guideline updates 2. Re-assess all recommendations 3. Incorporate year's evidence 4. Major version release 5. Continuing education activities