gh-k-dense-ai-claude-scient…/skills/datamol/references/conformers_module.md
2025-11-30 08:30:10 +08:00

Datamol Conformers Module Reference

The datamol.conformers module provides tools for generating and analyzing 3D molecular conformations.

Conformer Generation

dm.conformers.generate(mol, n_confs=None, rms_cutoff=None, minimize_energy=True, method='ETKDGv3', add_hs=True, ...)

Generate 3D molecular conformers.

  • Parameters:
    • mol: Input molecule
    • n_confs: Number of conformers to generate (if None, chosen automatically from the number of rotatable bonds)
    • rms_cutoff: RMS threshold in Ångströms; conformers closer than this to one already kept are discarded as near-duplicates
    • minimize_energy: Apply UFF energy minimization (default: True)
    • method: Embedding method - options:
      • 'ETDG' - Experimental-Torsion Distance Geometry
      • 'ETKDG' - ETDG plus basic chemical knowledge terms (e.g., planar aromatic rings)
      • 'ETKDGv2' - ETKDG with updated torsion preferences
      • 'ETKDGv3' - Adds improved handling of macrocycles and small rings (default, recommended)
    • add_hs: Add hydrogens before embedding (default: True, critical for quality)
    • random_seed: Set for reproducibility
  • Returns: Molecule with embedded conformers
  • Example:
    mol = dm.to_mol("CCO")
    mol_3d = dm.conformers.generate(mol, n_confs=10, rms_cutoff=0.5)
    conformers = mol_3d.GetConformers()  # Access all conformers
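
The effect of rms_cutoff can be illustrated with a small greedy pruning sketch. It is written in plain Python on a hypothetical, precomputed RMSD matrix and does not call datamol or RDKit; the helper name prune_by_rms and the greedy strategy are assumptions for illustration, not necessarily the exact procedure the library applies.

```python
def prune_by_rms(rms_matrix, rms_cutoff):
    """Return indices of conformers kept after greedy RMS pruning.

    rms_matrix[i][j] is the RMSD (in Å) between conformers i and j.
    A conformer is kept only if it is at least rms_cutoff away from
    every conformer that has already been kept.
    """
    kept = []
    for i in range(len(rms_matrix)):
        if all(rms_matrix[i][j] >= rms_cutoff for j in kept):
            kept.append(i)
    return kept


# Hypothetical RMSD values (Å) for four conformers; 0 and 1 are near-duplicates.
rms_matrix = [
    [0.0, 0.2, 1.4, 0.9],
    [0.2, 0.0, 1.3, 0.8],
    [1.4, 1.3, 0.0, 0.6],
    [0.9, 0.8, 0.6, 0.0],
]

kept_tight = prune_by_rms(rms_matrix, rms_cutoff=0.5)
kept_loose = prune_by_rms(rms_matrix, rms_cutoff=1.0)
print(kept_tight)  # [0, 2, 3]
print(kept_loose)  # [0, 2]
```

Raising the cutoff from 0.5 Å to 1.0 Å drops conformer 3 as well, because it sits within 1.0 Å of conformer 0.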
    

Conformer Clustering

dm.conformers.cluster(mol, rms_cutoff=1.0, already_aligned=False, centroids=False)

Group conformers by RMS distance.

  • Parameters:
    • rms_cutoff: Clustering threshold in Ångströms (default: 1.0)
    • already_aligned: Whether conformers are pre-aligned
    • centroids: Return centroid conformers (True) or cluster groups (False)
  • Returns: Cluster information or centroid conformers
  • Use case: Identify distinct conformational families

dm.conformers.return_centroids(mol, conf_clusters, centroids=True)

Extract representative conformers from clusters.

  • Parameters:
    • conf_clusters: Sequence of cluster indices from cluster()
    • centroids: Return single molecule (True) or list of molecules (False)
  • Returns: Centroid conformer(s)
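
To show what grouping conformers by an RMS threshold and picking representatives can look like, here is a minimal Butina-style sketch in plain Python. The function name cluster_by_rms, the toy RMSD matrix, and the convention of listing the centroid first in each cluster are assumptions of this sketch; it does not call datamol and may differ from the library's internals.

```python
def cluster_by_rms(rms_matrix, rms_cutoff):
    """Butina-style clustering on a precomputed RMSD matrix.

    Returns a list of clusters. Each cluster is a list of conformer
    indices whose first element is the cluster centroid.
    """
    n = len(rms_matrix)
    # Neighbours of i: every other conformer within the cutoff.
    neighbours = {
        i: {j for j in range(n) if j != i and rms_matrix[i][j] <= rms_cutoff}
        for i in range(n)
    }
    unassigned = set(range(n))
    clusters = []
    while unassigned:
        # Centroid: the unassigned conformer with the most unassigned
        # neighbours (ties go to the lowest index).
        centroid = max(
            sorted(unassigned),
            key=lambda i: len(neighbours[i] & unassigned),
        )
        members = sorted(neighbours[centroid] & unassigned)
        clusters.append([centroid] + members)
        unassigned -= {centroid, *members}
    return clusters


# Hypothetical RMSD values (Å): conformers 0-2 form one family, 3-4 another.
rms_matrix = [
    [0.0, 0.3, 0.4, 2.0, 2.1],
    [0.3, 0.0, 0.5, 1.9, 2.2],
    [0.4, 0.5, 0.0, 1.8, 2.0],
    [2.0, 1.9, 1.8, 0.0, 0.6],
    [2.1, 2.2, 2.0, 0.6, 0.0],
]

clusters = cluster_by_rms(rms_matrix, rms_cutoff=1.0)
centroid_ids = [cluster[0] for cluster in clusters]
print(clusters)      # [[0, 1, 2], [3, 4]]
print(centroid_ids)  # [0, 3]
```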

Conformer Analysis

dm.conformers.rmsd(mol)

Calculate pairwise RMSD matrix across all conformers.

  • Requirements: Minimum 2 conformers
  • Returns: NxN matrix of RMSD values
  • Use case: Quantify conformer diversity
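
The shape and meaning of the NxN matrix can be sketched in plain Python. This simplified version assumes the conformers are already aligned and ignores molecular symmetry, both of which a real implementation would normally handle; the helper names are hypothetical and nothing here calls datamol.

```python
import math


def rmsd(coords_a, coords_b):
    """RMSD between two conformers given as equal-length lists of (x, y, z)."""
    if len(coords_a) != len(coords_b):
        raise ValueError("conformers must have the same number of atoms")
    squared = sum(
        (a - b) ** 2
        for atom_a, atom_b in zip(coords_a, coords_b)
        for a, b in zip(atom_a, atom_b)
    )
    return math.sqrt(squared / len(coords_a))


def pairwise_rmsd(conformers):
    """Symmetric NxN matrix of RMSD values with zeros on the diagonal."""
    n = len(conformers)
    matrix = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            matrix[i][j] = matrix[j][i] = rmsd(conformers[i], conformers[j])
    return matrix


# Three toy "conformers" of a two-atom molecule (coordinates in Å).
conformers = [
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
    [(0.0, 0.0, 0.0), (0.0, 1.0, 0.0)],
    [(0.0, 0.0, 0.0), (-1.0, 0.0, 0.0)],
]

matrix = pairwise_rmsd(conformers)
for row in matrix:
    print([round(value, 3) for value in row])
```

Larger off-diagonal values indicate a more diverse conformer set.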

dm.conformers.sasa(mol, n_jobs=1, ...)

Calculate Solvent Accessible Surface Area (SASA) using FreeSASA.

  • Parameters:
    • n_jobs: Parallelization for multiple conformers
  • Returns: Array of SASA values (one per conformer)
  • Storage: Values stored in each conformer as property 'rdkit_free_sasa'
  • Example:
    sasa_values = dm.conformers.sasa(mol_3d)
    # Or access from conformer properties
    conf = mol_3d.GetConformer(0)
    sasa = conf.GetDoubleProp('rdkit_free_sasa')
    

Low-Level Conformer Manipulation

dm.conformers.center_of_mass(mol, use_atoms=True, digits=None, conf_id=-1)

Calculate the center of a conformer.

  • Parameters:
    • use_atoms: Weight positions by atomic mass (True) or compute the unweighted geometric center (False)
    • digits: Number of decimal places to round the result to (None for no rounding)
    • conf_id: Conformer ID (-1 selects the molecule's default conformer)
  • Returns: 3D coordinates of the center
  • Use case: Centering molecules for visualization or alignment
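
The difference between a mass-weighted center and a geometric center can be seen in a few lines of plain Python. The function name center and the diatomic coordinates are hypothetical, chosen only for illustration; the sketch does not call datamol.

```python
def center(coords, masses=None):
    """Mass-weighted center if masses are given, else the geometric center."""
    if masses is None:
        masses = [1.0] * len(coords)  # equal weights give the geometric center
    total = sum(masses)
    return tuple(
        sum(m * atom[axis] for m, atom in zip(masses, coords)) / total
        for axis in range(3)
    )


# Hypothetical diatomic along the x axis: carbon at 0 Å, oxygen at 1.2 Å.
coords = [(0.0, 0.0, 0.0), (1.2, 0.0, 0.0)]
masses = [12.011, 15.999]  # standard atomic weights of C and O

geometric = center(coords)
weighted = center(coords, masses)
print(geometric)  # midpoint of the two atoms
print(weighted)   # shifted toward the heavier oxygen atom
```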

dm.conformers.get_coords(mol, conf_id=-1)

Retrieve atomic coordinates from a conformer.

  • Returns: Nx3 numpy array of atomic positions
  • Example:
    positions = dm.conformers.get_coords(mol_3d, conf_id=0)
    # positions.shape: (num_atoms, 3)
    

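dm.conformers.translate(mol, new_centroid, conf_id=-1)

Move a conformer so that its centroid lies at a new position.

  • Parameters:
    • new_centroid: Target position as [x, y, z]
    • conf_id: Conformer ID (-1 selects the molecule's default conformer)
  • Modification: Operates in-place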
  • Use case: Aligning or repositioning molecules
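
Repositioning a conformer amounts to applying one rigid transform to every atomic position. The following plain-Python sketch assumes a pure translation, expressed as a 4x4 homogeneous matrix, that moves the centroid onto a target point. All helper names are hypothetical. Unlike the in-place library call, this sketch returns new coordinates.

```python
def centroid(coords):
    """Unweighted mean position of a list of (x, y, z) points."""
    n = len(coords)
    return [sum(atom[axis] for atom in coords) / n for axis in range(3)]


def translation_matrix(offset):
    """4x4 homogeneous matrix that shifts points by offset = (dx, dy, dz)."""
    dx, dy, dz = offset
    return [
        [1.0, 0.0, 0.0, dx],
        [0.0, 1.0, 0.0, dy],
        [0.0, 0.0, 1.0, dz],
        [0.0, 0.0, 0.0, 1.0],
    ]


def apply_transform(matrix, coords):
    """Apply a 4x4 homogeneous transform to a list of (x, y, z) points."""
    moved = []
    for x, y, z in coords:
        point = (x, y, z, 1.0)
        moved.append(tuple(
            sum(matrix[row][k] * point[k] for k in range(4))
            for row in range(3)
        ))
    return moved


coords = [(1.0, 2.0, 3.0), (3.0, 2.0, 3.0)]
target = (0.0, 0.0, 0.0)

# Offset that carries the current centroid onto the target point.
offset = [t - c for t, c in zip(target, centroid(coords))]
moved = apply_transform(translation_matrix(offset), coords)
print(moved)  # [(-1.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
```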

Workflow Example

import datamol as dm

# 1. Create molecule and generate conformers
mol = dm.to_mol("CC(C)CCO")  # Isopentanol
mol_3d = dm.conformers.generate(
    mol,
    n_confs=50,           # Generate 50 initial conformers
    rms_cutoff=0.5,       # Filter similar conformers
    minimize_energy=True   # Minimize energy
)

# 2. Analyze conformers
n_conformers = mol_3d.GetNumConformers()
print(f"Generated {n_conformers} unique conformers")

# 3. Calculate SASA
sasa_values = dm.conformers.sasa(mol_3d)

# 4. Cluster conformers
clusters = dm.conformers.cluster(mol_3d, rms_cutoff=1.0, centroids=False)

# 5. Get representative conformers
centroids = dm.conformers.return_centroids(mol_3d, clusters)

# 6. Access 3D coordinates
coords = dm.conformers.get_coords(mol_3d, conf_id=0)

Key Concepts

  • Distance Geometry: Generates 3D coordinates that satisfy interatomic distance bounds derived from the molecule's connectivity
  • ETKDG: Distance geometry refined with experimental torsion-angle preferences and basic chemical knowledge
  • RMS Cutoff: Lower values keep more conformers, including near-duplicates; higher values keep fewer, more distinct conformers
  • Energy Minimization: Relaxes structures to nearest local energy minimum
  • Hydrogens: Critical for accurate 3D geometry - always include during embedding