5.0 KiB
DiffDock Configuration Parameters Reference
This document provides comprehensive details on all DiffDock configuration parameters and command-line options.
Model & Checkpoint Settings
Model Paths
-
--model_dir: Directory containing the score model checkpoint- Default:
./workdir/v1.1/score_model - DiffDock-L model (current default)
- Default:
-
--confidence_model_dir: Directory containing the confidence model checkpoint- Default:
./workdir/v1.1/confidence_model
- Default:
-
--ckpt: Name of the score model checkpoint file- Default:
best_ema_inference_epoch_model.pt
- Default:
-
--confidence_ckpt: Name of the confidence model checkpoint file- Default:
best_model_epoch75.pt
- Default:
Model Version Flags
-
--old_score_model: Use original DiffDock model instead of DiffDock-L- Default:
false(uses DiffDock-L)
- Default:
-
--old_filtering_model: Use legacy confidence filtering approach- Default:
true
- Default:
Input/Output Options
Input Specification
-
--protein_path: Path to protein PDB file- Example:
--protein_path protein.pdb - Alternative to
--protein_sequence
- Example:
-
--protein_sequence: Amino acid sequence for ESMFold folding- Automatically generates protein structure from sequence
- Alternative to
--protein_path
-
--ligand: Ligand specification (SMILES string or file path)- SMILES string:
--ligand "COc(cc1)ccc1C#N" - File path:
--ligand ligand.sdfor.mol2
- SMILES string:
-
--protein_ligand_csv: CSV file for batch processing- Required columns:
complex_name,protein_path,ligand_description,protein_sequence - Example:
--protein_ligand_csv data/protein_ligand_example.csv
- Required columns:
Output Control
-
--out_dir: Output directory for predictions- Example:
--out_dir results/user_predictions/
- Example:
-
--save_visualisation: Export predicted molecules as SDF files- Enables visualization of results
Inference Parameters
Diffusion Steps
-
--inference_steps: Number of planned inference iterations- Default:
20 - Higher values may improve accuracy but increase runtime
- Default:
-
--actual_steps: Actual diffusion steps executed- Default:
19
- Default:
-
--no_final_step_noise: Omit noise at the final diffusion step- Default:
true
- Default:
Sampling Settings
-
--samples_per_complex: Number of samples to generate per complex- Default:
10 - More samples provide better coverage but increase computation
- Default:
-
--sigma_schedule: Noise schedule type- Default:
expbeta(exponential-beta)
- Default:
-
--initial_noise_std_proportion: Initial noise standard deviation scaling- Default:
1.46
- Default:
Temperature Parameters
Sampling Temperatures (Controls diversity of predictions)
-
--temp_sampling_tr: Translation sampling temperature- Default:
1.17
- Default:
-
--temp_sampling_rot: Rotation sampling temperature- Default:
2.06
- Default:
-
--temp_sampling_tor: Torsion sampling temperature- Default:
7.04
- Default:
Psi Angle Temperatures
-
--temp_psi_tr: Translation psi temperature- Default:
0.73
- Default:
-
--temp_psi_rot: Rotation psi temperature- Default:
0.90
- Default:
-
--temp_psi_tor: Torsion psi temperature- Default:
0.59
- Default:
Sigma Data Temperatures
-
--temp_sigma_data_tr: Translation data distribution scaling- Default:
0.93
- Default:
-
--temp_sigma_data_rot: Rotation data distribution scaling- Default:
0.75
- Default:
-
--temp_sigma_data_tor: Torsion data distribution scaling- Default:
0.69
- Default:
Processing Options
Performance
-
--batch_size: Processing batch size- Default:
10 - Larger values increase throughput but require more memory
- Default:
-
--tqdm: Enable progress bar visualization- Useful for monitoring long-running jobs
Protein Structure
-
--chain_cutoff: Maximum number of protein chains to process- Example:
--chain_cutoff 10 - Useful for large multi-chain complexes
- Example:
-
--esm_embeddings_path: Path to pre-computed ESM2 protein embeddings- Speeds up inference by reusing embeddings
- Optional optimization
Dataset Options
--split: Dataset split to use (train/test/val)- Used for evaluation on standard benchmarks
Advanced Flags
Debugging & Testing
-
--no_model: Disable model inference (debugging)- Default:
false
- Default:
-
--no_random: Disable randomization- Default:
false - Useful for reproducibility testing
- Default:
Alternative Sampling
-
--ode: Use ODE solver instead of SDE- Default:
false - Alternative sampling approach
- Default:
-
--different_schedules: Use different noise schedules per component- Default:
false
- Default:
Error Handling
--limit_failures: Maximum allowed failures before stopping- Default:
5
- Default:
Configuration File
All parameters can be specified in a YAML configuration file (typically default_inference_args.yaml) or overridden via command line:
python -m inference --config default_inference_args.yaml --samples_per_complex 20
Command-line arguments take precedence over configuration file values.