# DiffDock Confidence Scores and Limitations This document provides detailed guidance on interpreting DiffDock confidence scores and understanding the tool's limitations. ## Confidence Score Interpretation DiffDock generates a confidence score for each predicted binding pose. This score indicates the model's certainty about the prediction. ### Score Ranges | Score Range | Confidence Level | Interpretation | |------------|------------------|----------------| | **> 0** | High confidence | Strong prediction, likely accurate binding pose | | **-1.5 to 0** | Moderate confidence | Reasonable prediction, may need validation | | **< -1.5** | Low confidence | Uncertain prediction, requires careful validation | ### Important Notes on Confidence Scores 1. **Not Binding Affinity**: Confidence scores reflect prediction certainty, NOT binding affinity strength - High confidence = model is confident about the structure - Does NOT indicate strong/weak binding affinity 2. **Context-Dependent**: Confidence scores should be adjusted based on system complexity: - **Lower expectations** for: - Large ligands (>500 Da) - Protein complexes with many chains - Unbound protein conformations (may require conformational changes) - Novel protein families not well-represented in training data - **Higher expectations** for: - Drug-like small molecules (150-500 Da) - Single-chain proteins or well-defined binding sites - Proteins similar to those in training data (PDBBind, BindingMOAD) 3. **Multiple Predictions**: DiffDock generates multiple samples per complex (default: 10) - Review top-ranked predictions (by confidence) - Consider clustering similar poses - High-confidence consensus across multiple samples strengthens prediction ## What DiffDock Predicts ### ✅ DiffDock DOES Predict - **Binding poses**: 3D spatial orientation of ligand in protein binding site - **Confidence scores**: Model's certainty about predictions - **Multiple conformations**: Various possible binding modes ### ❌ DiffDock DOES NOT Predict - **Binding affinity**: Strength of protein-ligand interaction (ΔG, Kd, Ki) - **Binding kinetics**: On/off rates, residence time - **ADMET properties**: Absorption, distribution, metabolism, excretion, toxicity - **Selectivity**: Relative binding to different targets ## Scope and Limitations ### Designed For - **Small molecule docking**: Organic compounds typically 100-1000 Da - **Protein targets**: Single or multi-chain proteins - **Small peptides**: Short peptide ligands (< ~20 residues) - **Small nucleic acids**: Short oligonucleotides ### NOT Designed For - **Large biomolecules**: Full protein-protein interactions - Use DiffDock-PP, AlphaFold-Multimer, or RoseTTAFold2NA instead - **Large peptides/proteins**: >20 residues as ligands - **Covalent docking**: Irreversible covalent bond formation - **Metalloprotein specifics**: May not accurately handle metal coordination - **Membrane proteins**: Not specifically trained on membrane-embedded proteins ### Training Data Considerations DiffDock was trained on: - **PDBBind**: Diverse protein-ligand complexes - **BindingMOAD**: Multi-domain protein structures **Implications**: - Best performance on proteins/ligands similar to training data - May underperform on: - Novel protein families - Unusual ligand chemotypes - Allosteric sites not well-represented in training data ## Validation and Complementary Tools ### Recommended Workflow 1. **Generate poses with DiffDock** - Use confidence scores for initial ranking - Consider multiple high-confidence predictions 2. **Visual Inspection** - Examine protein-ligand interactions in molecular viewer - Check for reasonable: - Hydrogen bonds - Hydrophobic interactions - Steric complementarity - Electrostatic interactions 3. **Scoring and Refinement** (choose one or more): - **GNINA**: Deep learning-based scoring function - **Molecular mechanics**: Energy minimization and refinement - **MM/GBSA or MM/PBSA**: Binding free energy estimation - **Free energy calculations**: FEP or TI for accurate affinity prediction 4. **Experimental Validation** - Biochemical assays (IC50, Kd measurements) - Structural validation (X-ray crystallography, cryo-EM) ### Tools for Binding Affinity Assessment DiffDock should be combined with these tools for affinity prediction: - **GNINA**: Fast, accurate scoring function - Github: github.com/gnina/gnina - **AutoDock Vina**: Classical docking and scoring - Website: vina.scripps.edu - **Free Energy Calculations**: - OpenMM + OpenFE - GROMACS + ABFE/RBFE protocols - **MM/GBSA Tools**: - MMPBSA.py (AmberTools) - gmx_MMPBSA ## Performance Optimization ### For Best Results 1. **Protein Preparation**: - Remove water molecules far from binding site - Resolve missing residues if possible - Consider protonation states at physiological pH 2. **Ligand Input**: - Provide reasonable 3D conformers when using structure files - Use canonical SMILES for consistent results - Pre-process with RDKit if needed 3. **Computational Resources**: - GPU strongly recommended (10-100x speedup) - First run pre-computes lookup tables (takes a few minutes) - Batch processing more efficient than single predictions 4. **Parameter Tuning**: - Increase `samples_per_complex` for difficult cases (20-40) - Adjust temperature parameters for diversity/accuracy trade-off - Use pre-computed ESM embeddings for repeated predictions ## Common Issues and Troubleshooting ### Low Confidence Scores - **Large/flexible ligands**: Consider splitting into fragments or use alternative methods - **Multiple binding sites**: May predict multiple locations with distributed confidence - **Protein flexibility**: Consider using ensemble of protein conformations ### Unrealistic Predictions - **Clashes**: May indicate need for protein preparation or refinement - **Surface binding**: Check if true binding site is blocked or unclear - **Unusual poses**: Consider increasing samples to explore more conformations ### Slow Performance - **Use GPU**: Essential for reasonable runtime - **Pre-compute embeddings**: Reuse ESM embeddings for same protein - **Batch processing**: More efficient than sequential individual predictions - **Reduce samples**: Lower `samples_per_complex` for quick screening ## Citation and Further Reading For methodology details and benchmarking results, see: 1. **Original DiffDock Paper** (ICLR 2023): - "DiffDock: Diffusion Steps, Twists, and Turns for Molecular Docking" - Corso et al., arXiv:2210.01776 2. **DiffDock-L Paper** (2024): - Enhanced model with improved generalization - Stärk et al., arXiv:2402.18396 3. **PoseBusters Benchmark**: - Rigorous docking evaluation framework - Used for DiffDock validation