gh-k-dense-ai-claude-scient…/skills/rdkit/references/api_reference.md
2025-11-30 08:30:10 +08:00

RDKit API Reference

This document provides a comprehensive reference for RDKit's Python API, organized by functionality.

Core Module: rdkit.Chem

The fundamental module for working with molecules.

Molecule I/O

Reading Molecules:

  • Chem.MolFromSmiles(smiles, sanitize=True) - Parse SMILES string
  • Chem.MolFromSmarts(smarts) - Parse SMARTS pattern
  • Chem.MolFromMolFile(filename, sanitize=True, removeHs=True) - Read MOL file
  • Chem.MolFromMolBlock(molblock, sanitize=True, removeHs=True) - Parse MOL block string
  • Chem.MolFromMol2File(filename, sanitize=True, removeHs=True) - Read MOL2 file
  • Chem.MolFromMol2Block(molblock, sanitize=True, removeHs=True) - Parse MOL2 block
  • Chem.MolFromPDBFile(filename, sanitize=True, removeHs=True) - Read PDB file
  • Chem.MolFromPDBBlock(pdbblock, sanitize=True, removeHs=True) - Parse PDB block
  • Chem.MolFromInchi(inchi, sanitize=True, removeHs=True) - Parse InChI string
  • Chem.MolFromSequence(seq, sanitize=True) - Create molecule from peptide sequence
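
As a minimal sketch of the readers above: the parsing functions return `None` on failure rather than raising, so results are usually checked before use. The example molecules are arbitrary choices for illustration.

```python
from rdkit import Chem

# Parse a valid SMILES string; the result is an RDKit Mol object.
mol = Chem.MolFromSmiles("CCO")  # ethanol
print(mol.GetNumAtoms())  # counts heavy atoms only (hydrogens are implicit)

# An invalid SMILES (here an unclosed ring) yields None and logs an error.
bad = Chem.MolFromSmiles("C1CC")
if bad is None:
    print("could not parse SMILES")
```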

Writing Molecules:

  • Chem.MolToSmiles(mol, isomericSmiles=True, canonical=True) - Convert to SMILES
  • Chem.MolToSmarts(mol, isomericSmiles=True) - Convert to SMARTS
  • Chem.MolToMolBlock(mol, includeStereo=True, confId=-1) - Convert to MOL block
  • Chem.MolToMolFile(mol, filename, includeStereo=True, confId=-1) - Write MOL file
  • Chem.MolToPDBBlock(mol, confId=-1) - Convert to PDB block
  • Chem.MolToPDBFile(mol, filename, confId=-1) - Write PDB file
  • Chem.MolToInchi(mol, options='') - Convert to InChI
  • Chem.MolToInchiKey(mol, options='') - Generate InChI key
  • Chem.MolToSequence(mol) - Convert to peptide sequence
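
A short sketch of the writers, assuming only the functions listed above: a molecule is written as canonical SMILES and as a MOL block, and the MOL block is read back to confirm the round trip.

```python
from rdkit import Chem

mol = Chem.MolFromSmiles("OCC")  # ethanol, written in a non-canonical order

# Canonical SMILES is independent of the input atom order.
canonical = Chem.MolToSmiles(mol)

# Serialize to a MOL block and parse it back.
molblock = Chem.MolToMolBlock(mol)
roundtrip = Chem.MolFromMolBlock(molblock)
print(canonical, Chem.MolToSmiles(roundtrip))
```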

Batch I/O:

  • Chem.SDMolSupplier(filename, sanitize=True, removeHs=True) - SDF file reader
  • Chem.ForwardSDMolSupplier(fileobj, sanitize=True, removeHs=True) - Forward-only SDF reader
  • Chem.MultithreadedSDMolSupplier(filename, numWriterThreads=1) - Parallel SDF reader
  • Chem.SmilesMolSupplier(filename, delimiter=' ', titleLine=True) - SMILES file reader
  • Chem.SDWriter(filename) - SDF file writer
  • Chem.SmilesWriter(filename, delimiter=' ', includeHeader=True) - SMILES file writer

Molecular Manipulation

Sanitization:

  • Chem.SanitizeMol(mol, sanitizeOps=SANITIZE_ALL, catchErrors=False) - Sanitize molecule
  • Chem.DetectChemistryProblems(mol, sanitizeOps=SANITIZE_ALL) - Detect sanitization issues
  • Chem.AssignStereochemistry(mol, cleanIt=False, force=False) - Assign stereochemistry
  • Chem.FindPotentialStereo(mol) - Find potential stereocenters
  • Chem.AssignStereochemistryFrom3D(mol, confId=-1) - Assign stereo from 3D coords
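
One way to use the sanitization functions together, sketched under the assumption that the input may be chemically invalid: parse with `sanitize=False`, inspect the problems, then sanitize with `catchErrors=True` so a failure is reported as a flag instead of an exception.

```python
from rdkit import Chem

# Neutral nitrogen with four bonds: parses only when sanitization is deferred.
mol = Chem.MolFromSmiles("CN(C)(C)C", sanitize=False)

# Each problem is an exception object describing what went wrong and where.
problems = Chem.DetectChemistryProblems(mol)
for p in problems:
    print(p.GetType(), p.GetAtomIdx(), p.Message())

# With catchErrors=True the return value is the first failing operation,
# or SANITIZE_NONE when everything succeeded.
result = Chem.SanitizeMol(mol, catchErrors=True)
print(result)
```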

Hydrogen Management:

  • Chem.AddHs(mol, explicitOnly=False, addCoords=False) - Add explicit hydrogens
  • Chem.RemoveHs(mol, implicitOnly=False, updateExplicitCount=False) - Remove hydrogens
  • Chem.RemoveAllHs(mol) - Remove all hydrogens

Aromaticity:

  • Chem.SetAromaticity(mol, model=AROMATICITY_DEFAULT) - Perceive aromaticity with the given model (e.g. AROMATICITY_RDKIT, AROMATICITY_MDL)
  • Chem.Kekulize(mol, clearAromaticFlags=False) - Kekulize aromatic bonds
  • Chem.SetConjugation(mol) - Set conjugation flags

Fragments:

  • Chem.GetMolFrags(mol, asMols=False, sanitizeFrags=True) - Get disconnected fragments
  • Chem.FragmentOnBonds(mol, bondIndices, addDummies=True) - Fragment on specific bonds
  • Chem.ReplaceSubstructs(mol, query, replacement, replaceAll=False) - Replace substructures
  • Chem.DeleteSubstructs(mol, query, onlyFrags=False) - Delete substructures
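
A small sketch of fragment handling, using sodium acetate as an arbitrary example: the disconnected fragments are split into separate molecules, and the counterion is deleted with a SMARTS query.

```python
from rdkit import Chem

salt = Chem.MolFromSmiles("CC(=O)[O-].[Na+]")

# asMols=True returns one Mol per disconnected fragment.
frags = Chem.GetMolFrags(salt, asMols=True)
largest = max(frags, key=lambda m: m.GetNumHeavyAtoms())

# Alternatively, delete every match of a query pattern.
stripped = Chem.DeleteSubstructs(salt, Chem.MolFromSmarts("[Na+]"))
print(Chem.MolToSmiles(largest), Chem.MolToSmiles(stripped))
```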

Stereochemistry:

  • Chem.FindMolChiralCenters(mol, includeUnassigned=False, useLegacyImplementation=False) - Find chiral centers
  • Chem.FindPotentialStereo(mol, cleanIt=False, flagPossible=True) - Find potential stereocenters
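
As an illustration, `FindMolChiralCenters` returns `(atom index, label)` pairs. The sketch below uses alanine with and without specified stereochemistry; the exact labels can depend on the RDKit version, so only the indices are relied on here.

```python
from rdkit import Chem

# Stereocenter specified in the SMILES.
alanine = Chem.MolFromSmiles("C[C@H](N)C(=O)O")
centers = Chem.FindMolChiralCenters(alanine)

# Same skeleton without stereo information: the center is reported only
# when unassigned centers are requested.
flat = Chem.MolFromSmiles("CC(N)C(=O)O")
assigned_only = Chem.FindMolChiralCenters(flat)
possible = Chem.FindMolChiralCenters(flat, includeUnassigned=True)
print(centers, assigned_only, possible)
```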

Substructure Searching

Basic Matching:

  • mol.HasSubstructMatch(query, useChirality=False) - Check for substructure match
  • mol.GetSubstructMatch(query, useChirality=False) - Get first match
  • mol.GetSubstructMatches(query, uniquify=True, useChirality=False) - Get all matches
  • mol.GetSubstructMatches(query, maxMatches=1000) - Limit number of matches
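
A minimal matching sketch, assuming a SMARTS query for a hydroxyl group: matches are tuples of atom indices in the target molecule, ordered like the atoms of the query.

```python
from rdkit import Chem

phenol = Chem.MolFromSmiles("c1ccccc1O")
hydroxyl = Chem.MolFromSmarts("[OX2H]")  # two-connected oxygen with one H

has_oh = phenol.HasSubstructMatch(hydroxyl)
first = phenol.GetSubstructMatch(hydroxyl)
matches = phenol.GetSubstructMatches(hydroxyl)
print(has_oh, first, matches)
```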

Molecular Properties

Atom Methods:

  • atom.GetSymbol() - Atomic symbol
  • atom.GetAtomicNum() - Atomic number
  • atom.GetDegree() - Number of explicit neighbors (bonds to atoms present in the graph)
  • atom.GetTotalDegree() - Degree including implicit hydrogens
  • atom.GetFormalCharge() - Formal charge
  • atom.GetNumRadicalElectrons() - Radical electrons
  • atom.GetIsAromatic() - Aromaticity flag
  • atom.GetHybridization() - Hybridization (SP, SP2, SP3, etc.)
  • atom.GetIdx() - Atom index
  • atom.IsInRing() - In any ring
  • atom.IsInRingSize(size) - In ring of specific size
  • atom.GetChiralTag() - Chirality tag

Bond Methods:

  • bond.GetBondType() - Bond type (SINGLE, DOUBLE, TRIPLE, AROMATIC)
  • bond.GetBeginAtomIdx() - Starting atom index
  • bond.GetEndAtomIdx() - Ending atom index
  • bond.GetIsConjugated() - Conjugation flag
  • bond.GetIsAromatic() - Aromaticity flag
  • bond.IsInRing() - In any ring
  • bond.GetStereo() - Stereochemistry (STEREONONE, STEREOZ, STEREOE, etc.)

Molecule Methods:

  • mol.GetNumAtoms(onlyExplicit=True) - Number of atoms
  • mol.GetNumHeavyAtoms() - Number of heavy atoms
  • mol.GetNumBonds() - Number of bonds
  • mol.GetAtoms() - Iterator over atoms
  • mol.GetBonds() - Iterator over bonds
  • mol.GetAtomWithIdx(idx) - Get specific atom
  • mol.GetBondWithIdx(idx) - Get specific bond
  • mol.GetRingInfo() - Ring information object

Ring Information:

  • Chem.GetSymmSSSR(mol) - Get the symmetrized smallest set of smallest rings
  • Chem.GetSSSR(mol) - Get the (unsymmetrized) smallest set of smallest rings; older releases return only the ring count
  • ring_info.NumRings() - Number of rings
  • ring_info.AtomRings() - Tuples of atom indices in rings
  • ring_info.BondRings() - Tuples of bond indices in rings
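
The accessors above can be combined as in this sketch, which walks the atoms and rings of naphthalene (an arbitrary example molecule).

```python
from rdkit import Chem

mol = Chem.MolFromSmiles("c1ccc2ccccc2c1")  # naphthalene

# Per-atom properties via the atom iterator.
aromatic_idx = [a.GetIdx() for a in mol.GetAtoms() if a.GetIsAromatic()]

# Ring membership via the ring information object.
ring_info = mol.GetRingInfo()
ring_sizes = [len(ring) for ring in ring_info.AtomRings()]
print(mol.GetNumAtoms(), mol.GetNumBonds(), ring_info.NumRings(), ring_sizes)
```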

rdkit.Chem.AllChem

Extended chemistry functionality.

2D/3D Coordinate Generation

  • AllChem.Compute2DCoords(mol, canonOrient=True, clearConfs=True) - Generate 2D coordinates
  • AllChem.EmbedMolecule(mol, maxAttempts=0, randomSeed=-1, useRandomCoords=False) - Generate 3D conformer
  • AllChem.EmbedMultipleConfs(mol, numConfs=10, maxAttempts=0, randomSeed=-1) - Generate multiple conformers
  • AllChem.ConstrainedEmbed(mol, core, useTethers=True) - Constrained embedding
  • AllChem.GenerateDepictionMatching2DStructure(mol, reference, confId=-1) - Generate 2D coordinates aligned to a reference template

Force Field Optimization

  • AllChem.UFFOptimizeMolecule(mol, maxIters=200, confId=-1) - UFF optimization
  • AllChem.MMFFOptimizeMolecule(mol, maxIters=200, confId=-1, mmffVariant='MMFF94') - MMFF optimization
  • AllChem.UFFGetMoleculeForceField(mol, confId=-1) - Get UFF force field object
  • AllChem.MMFFGetMoleculeForceField(mol, pyMMFFMolProperties, confId=-1) - Get MMFF force field

Conformer Analysis

  • AllChem.GetConformerRMS(mol, confId1, confId2, prealigned=False) - Calculate RMSD
  • AllChem.GetConformerRMSMatrix(mol, prealigned=False) - RMSD matrix
  • AllChem.AlignMol(prbMol, refMol, prbCid=-1, refCid=-1) - Align molecules
  • AllChem.AlignMolConformers(mol) - Align all conformers
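
A typical 3D workflow chains the three groups above: embed conformers, relax them with a force field, then compare them. The sketch below assumes 1-butanol as the input and a fixed `randomSeed` for reproducibility; hydrogens are added first because embedding and force fields generally need them.

```python
from rdkit import Chem
from rdkit.Chem import AllChem

mol = Chem.AddHs(Chem.MolFromSmiles("CCCCO"))

# Embed several conformers; the return value holds the new conformer IDs.
conf_ids = list(AllChem.EmbedMultipleConfs(mol, numConfs=5, randomSeed=42))

# Optimize each conformer with MMFF94 (0 = converged, 1 = more iterations needed).
status = [AllChem.MMFFOptimizeMolecule(mol, confId=cid, maxIters=500) for cid in conf_ids]

# Align all conformers to the first one, then measure an RMSD between two of them.
AllChem.AlignMolConformers(mol)
rms = AllChem.GetConformerRMS(mol, conf_ids[0], conf_ids[1], prealigned=True)
print(status, round(rms, 3))
```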

Reactions

  • AllChem.ReactionFromSmarts(smarts, useSmiles=False) - Create reaction from SMARTS
  • reaction.RunReactants(reactants) - Apply reaction
  • reaction.RunReactant(reactant, reactionIdx) - Apply to specific reactant
  • AllChem.CreateDifferenceFingerprintForReaction(reaction) - Reaction fingerprint
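
As a sketch of the reaction API, the example below defines a simple amide coupling as reaction SMARTS (the template itself is an illustrative assumption, not a validated transform) and applies it to acetic acid and methylamine. Products come back unsanitized, so each one is sanitized before use.

```python
from rdkit import Chem
from rdkit.Chem import AllChem

# Carboxylic acid + primary amine -> amide; atom maps tie reactants to products.
rxn = AllChem.ReactionFromSmarts(
    "[C:1](=[O:2])[OH].[NX3;H2:3]>>[C:1](=[O:2])[N:3]"
)
acid = Chem.MolFromSmiles("CC(=O)O")
amine = Chem.MolFromSmiles("CN")

# RunReactants returns a tuple of product sets, one per way the templates match.
product_sets = rxn.RunReactants((acid, amine))
products = []
for product_set in product_sets:
    for product in product_set:
        Chem.SanitizeMol(product)
        products.append(Chem.MolToSmiles(product))
print(products)
```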

Fingerprints

  • AllChem.GetMorganFingerprint(mol, radius, useFeatures=False) - Morgan fingerprint
  • AllChem.GetMorganFingerprintAsBitVect(mol, radius, nBits=2048) - Morgan bit vector
  • AllChem.GetHashedMorganFingerprint(mol, radius, nBits=2048) - Hashed Morgan
  • AllChem.GetErGFingerprint(mol) - ErG fingerprint

rdkit.Chem.Descriptors

Molecular descriptor calculations.

Common Descriptors

  • Descriptors.MolWt(mol) - Molecular weight
  • Descriptors.ExactMolWt(mol) - Exact (monoisotopic) molecular weight
  • Descriptors.HeavyAtomMolWt(mol) - Heavy atom molecular weight
  • Descriptors.MolLogP(mol) - LogP (lipophilicity)
  • Descriptors.MolMR(mol) - Molar refractivity
  • Descriptors.TPSA(mol) - Topological polar surface area
  • Descriptors.NumHDonors(mol) - Hydrogen bond donors
  • Descriptors.NumHAcceptors(mol) - Hydrogen bond acceptors
  • Descriptors.NumRotatableBonds(mol) - Rotatable bonds
  • Descriptors.NumAromaticRings(mol) - Aromatic rings
  • Descriptors.NumSaturatedRings(mol) - Saturated rings
  • Descriptors.NumAliphaticRings(mol) - Aliphatic rings
  • Descriptors.NumAromaticHeterocycles(mol) - Aromatic heterocycles
  • Descriptors.NumRadicalElectrons(mol) - Radical electrons
  • Descriptors.NumValenceElectrons(mol) - Valence electrons
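
For illustration, a handful of these descriptors can be collected into a dictionary. Aspirin is used as an arbitrary example, and the dictionary keys are labels chosen for this sketch.

```python
from rdkit import Chem
from rdkit.Chem import Descriptors

aspirin = Chem.MolFromSmiles("CC(=O)Oc1ccccc1C(=O)O")

props = {
    "MolWt": Descriptors.MolWt(aspirin),
    "LogP": Descriptors.MolLogP(aspirin),
    "TPSA": Descriptors.TPSA(aspirin),
    "HBD": Descriptors.NumHDonors(aspirin),
    "HBA": Descriptors.NumHAcceptors(aspirin),
    "RotB": Descriptors.NumRotatableBonds(aspirin),
    "AromaticRings": Descriptors.NumAromaticRings(aspirin),
}
for name, value in props.items():
    print(f"{name}: {value}")
```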

Batch Calculation

  • Descriptors.CalcMolDescriptors(mol) - Calculate all descriptors as dictionary

Descriptor Lists

  • Descriptors._descList - List of (name, function) tuples for all descriptors
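
A sketch of computing every descriptor by iterating over `Descriptors._descList`. Note that the leading underscore marks it as a private attribute, so its contents may change between releases.

```python
from rdkit import Chem
from rdkit.Chem import Descriptors

mol = Chem.MolFromSmiles("CC(=O)Oc1ccccc1C(=O)O")

# Each entry is a (name, function) pair; call every function on the molecule.
all_descriptors = {name: fn(mol) for name, fn in Descriptors._descList}
print(len(all_descriptors), all_descriptors["MolWt"])
```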

rdkit.Chem.Draw

Molecular visualization.

Image Generation

  • Draw.MolToImage(mol, size=(300,300), kekulize=True, wedgeBonds=True, highlightAtoms=None) - Generate PIL image
  • Draw.MolToFile(mol, filename, size=(300,300), kekulize=True, wedgeBonds=True) - Save to file
  • Draw.MolsToGridImage(mols, molsPerRow=3, subImgSize=(200,200), legends=None) - Grid of molecules
  • Draw.MolsMatrixToGridImage(molsMatrix, subImgSize=(200,200), legendsMatrix=None) - Grid from a nested list (one inner list per row)
  • Draw.ReactionToImage(rxn, subImgSize=(200,200)) - Reaction image

Fingerprint Visualization

  • Draw.DrawMorganBit(mol, bitId, bitInfo, whichExample=0) - Visualize Morgan bit
  • Draw.DrawMorganBits(tpls, molsPerRow=3) - Multiple Morgan bits; tpls is a list of (mol, bitId, bitInfo) tuples
  • Draw.DrawRDKitBit(mol, bitId, bitInfo, whichExample=0) - Visualize RDKit bit

IPython Integration

  • Draw.IPythonConsole - Module for Jupyter integration (import it explicitly: from rdkit.Chem.Draw import IPythonConsole)
  • Draw.IPythonConsole.ipython_useSVG - Use SVG (True) or PNG (False)
  • Draw.IPythonConsole.molSize - Default molecule image size

Drawing Options

  • rdMolDraw2D.MolDrawOptions() - Get drawing options object
    • .addAtomIndices - Show atom indices
    • .addBondIndices - Show bond indices
    • .addStereoAnnotation - Show stereochemistry
    • .bondLineWidth - Line width
    • .highlightBondWidthMultiplier - Highlight width
    • .minFontSize - Minimum font size
    • .maxFontSize - Maximum font size

rdkit.Chem.rdMolDescriptors

Additional descriptor calculations.

  • rdMolDescriptors.CalcNumRings(mol) - Number of rings
  • rdMolDescriptors.CalcNumAromaticRings(mol) - Aromatic rings
  • rdMolDescriptors.CalcNumAliphaticRings(mol) - Aliphatic rings
  • rdMolDescriptors.CalcNumSaturatedRings(mol) - Saturated rings
  • rdMolDescriptors.CalcNumHeterocycles(mol) - Heterocycles
  • rdMolDescriptors.CalcNumAromaticHeterocycles(mol) - Aromatic heterocycles
  • rdMolDescriptors.CalcNumSpiroAtoms(mol) - Spiro atoms
  • rdMolDescriptors.CalcNumBridgeheadAtoms(mol) - Bridgehead atoms
  • rdMolDescriptors.CalcFractionCSP3(mol) - Fraction of sp3 carbons
  • rdMolDescriptors.CalcLabuteASA(mol) - Labute accessible surface area
  • rdMolDescriptors.CalcTPSA(mol) - TPSA
  • rdMolDescriptors.CalcMolFormula(mol) - Molecular formula
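
A brief usage sketch for this module, again with aspirin as an arbitrary example.

```python
from rdkit import Chem
from rdkit.Chem import rdMolDescriptors

aspirin = Chem.MolFromSmiles("CC(=O)Oc1ccccc1C(=O)O")

formula = rdMolDescriptors.CalcMolFormula(aspirin)
num_rings = rdMolDescriptors.CalcNumRings(aspirin)
num_aromatic = rdMolDescriptors.CalcNumAromaticRings(aspirin)
tpsa = rdMolDescriptors.CalcTPSA(aspirin)
print(formula, num_rings, num_aromatic, round(tpsa, 2))
```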

rdkit.Chem.Scaffolds

Scaffold analysis.

Murcko Scaffolds

  • MurckoScaffold.GetScaffoldForMol(mol) - Get Murcko scaffold
  • MurckoScaffold.MakeScaffoldGeneric(mol) - Generic scaffold
  • Chem.MurckoDecompose(mol) - Murcko decomposition; returns the scaffold (a function of rdkit.Chem, not of MurckoScaffold)
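
As a sketch, the scaffold functions strip side chains and then optionally reduce the scaffold to a generic carbon framework. The input molecule is an arbitrary example.

```python
from rdkit import Chem
from rdkit.Chem.Scaffolds import MurckoScaffold

mol = Chem.MolFromSmiles("Cc1ccc(CC2CCCCC2)cc1")

# Ring systems plus linkers, with side chains removed.
scaffold = MurckoScaffold.GetScaffoldForMol(mol)

# Generic form: every atom becomes carbon and every bond becomes single.
generic = MurckoScaffold.MakeScaffoldGeneric(scaffold)
print(Chem.MolToSmiles(scaffold), Chem.MolToSmiles(generic))
```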

rdkit.Chem.rdMolHash

Molecular hashing and standardization.

  • rdMolHash.MolHash(mol, hashFunction) - Generate hash
    • rdMolHash.HashFunction.AnonymousGraph - Anonymized structure
    • rdMolHash.HashFunction.CanonicalSmiles - Canonical SMILES
    • rdMolHash.HashFunction.ElementGraph - Element graph
    • rdMolHash.HashFunction.MurckoScaffold - Murcko scaffold
    • rdMolHash.HashFunction.Regioisomer - Regioisomer (no stereo)
    • rdMolHash.HashFunction.NetCharge - Net charge
    • rdMolHash.HashFunction.HetAtomProtomer - Heteroatom protomer
    • rdMolHash.HashFunction.HetAtomTautomer - Heteroatom tautomer
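
The hash functions differ in how much chemical detail they keep. In this sketch, ethanol and ethylamine share an anonymous graph (same connectivity, elements ignored) but not an element graph.

```python
from rdkit import Chem
from rdkit.Chem import rdMolHash

ethanol = Chem.MolFromSmiles("CCO")
ethylamine = Chem.MolFromSmiles("CCN")

# AnonymousGraph ignores element and bond-order information.
anon_a = rdMolHash.MolHash(ethanol, rdMolHash.HashFunction.AnonymousGraph)
anon_b = rdMolHash.MolHash(ethylamine, rdMolHash.HashFunction.AnonymousGraph)

# ElementGraph keeps the elements, so the two molecules now differ.
elem_a = rdMolHash.MolHash(ethanol, rdMolHash.HashFunction.ElementGraph)
elem_b = rdMolHash.MolHash(ethylamine, rdMolHash.HashFunction.ElementGraph)
print(anon_a, anon_b, elem_a, elem_b)
```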

rdkit.Chem.MolStandardize

Molecule standardization.

  • rdMolStandardize.Normalize(mol) - Normalize functional groups
  • rdMolStandardize.Reionize(mol) - Fix ionization state
  • rdMolStandardize.RemoveFragments(mol) - Remove small fragments
  • rdMolStandardize.Cleanup(mol) - Standard cleanup (remove Hs, disconnect metals, normalize, reionize); does not remove fragments
  • rdMolStandardize.Uncharger() - Create uncharger object
    • .uncharge(mol) - Remove charges
  • rdMolStandardize.TautomerEnumerator() - Enumerate tautomers
    • .Enumerate(mol) - Generate tautomers
    • .Canonicalize(mol) - Get canonical tautomer
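
A minimal standardization sketch, assuming acetate and the 2-hydroxypyridine/2-pyridone pair as example inputs: the uncharger neutralizes the anion, and the tautomer enumerator maps both tautomers to one canonical form.

```python
from rdkit import Chem
from rdkit.Chem.MolStandardize import rdMolStandardize

# Neutralize a charged molecule.
acetate = Chem.MolFromSmiles("CC(=O)[O-]")
uncharger = rdMolStandardize.Uncharger()
neutral = uncharger.uncharge(acetate)

# Enumerate tautomers and pick a canonical one.
enumerator = rdMolStandardize.TautomerEnumerator()
hydroxy = Chem.MolFromSmiles("Oc1ccccn1")
pyridone = Chem.MolFromSmiles("O=c1cccc[nH]1")
tautomers = enumerator.Enumerate(hydroxy)
canon_a = Chem.MolToSmiles(enumerator.Canonicalize(hydroxy))
canon_b = Chem.MolToSmiles(enumerator.Canonicalize(pyridone))
print(Chem.MolToSmiles(neutral), len(tautomers), canon_a, canon_b)
```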

rdkit.DataStructs

Fingerprint similarity and operations.

Similarity Metrics

  • DataStructs.TanimotoSimilarity(fp1, fp2) - Tanimoto coefficient
  • DataStructs.DiceSimilarity(fp1, fp2) - Dice coefficient
  • DataStructs.CosineSimilarity(fp1, fp2) - Cosine similarity
  • DataStructs.SokalSimilarity(fp1, fp2) - Sokal similarity
  • DataStructs.KulczynskiSimilarity(fp1, fp2) - Kulczynski similarity
  • DataStructs.McConnaugheySimilarity(fp1, fp2) - McConnaughey similarity

Bulk Operations

  • DataStructs.BulkTanimotoSimilarity(fp, fps) - Tanimoto for list of fingerprints
  • DataStructs.BulkDiceSimilarity(fp, fps) - Dice for list
  • DataStructs.BulkCosineSimilarity(fp, fps) - Cosine for list

Distance Metrics

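  • 1 - DataStructs.TanimotoSimilarity(fp1, fp2) - Tanimoto distance
  • 1 - DataStructs.DiceSimilarity(fp1, fp2) - Dice distance
  • DataStructs.BulkTanimotoSimilarity(fp, fps, returnDistance=True) - Bulk functions can return distances directly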
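
A short similarity sketch. The fingerprints are Morgan bit vectors from `rdFingerprintGenerator` (one possible choice; any bit-vector fingerprint works), and the three molecules are arbitrary examples.

```python
from rdkit import Chem, DataStructs
from rdkit.Chem import rdFingerprintGenerator

generator = rdFingerprintGenerator.GetMorganGenerator(radius=2, fpSize=2048)
smiles = ["CCO", "CCCO", "c1ccccc1"]
fps = [generator.GetFingerprint(Chem.MolFromSmiles(s)) for s in smiles]

# Pairwise similarity of a fingerprint with itself is 1.0.
self_sim = DataStructs.TanimotoSimilarity(fps[0], fps[0])

# Compare one query against many fingerprints in a single call.
sims = DataStructs.BulkTanimotoSimilarity(fps[0], fps[1:])

# A distance is one minus the similarity.
dists = [1.0 - s for s in sims]
print(self_sim, sims, dists)
```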

rdkit.Chem.AtomPairs

Atom pair fingerprints.

  • Pairs.GetAtomPairFingerprint(mol, minLength=1, maxLength=30) - Atom pair fingerprint
  • Pairs.GetAtomPairFingerprintAsBitVect(mol, minLength=1, maxLength=30, nBits=2048) - As bit vector
  • Pairs.GetHashedAtomPairFingerprint(mol, nBits=2048, minLength=1, maxLength=30) - Hashed version

rdkit.Chem.Torsions

Topological torsion fingerprints.

  • Torsions.GetTopologicalTorsionFingerprint(mol, targetSize=4) - Torsion fingerprint
  • Torsions.GetTopologicalTorsionFingerprintAsIntVect(mol, targetSize=4) - As int vector
  • Torsions.GetHashedTopologicalTorsionFingerprint(mol, nBits=2048, targetSize=4) - Hashed version

rdkit.Chem.MACCSkeys

MACCS structural keys.

  • MACCSkeys.GenMACCSKeys(mol) - Generate the 166 MACCS keys as a 167-bit vector (bit 0 is unused)
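
A quick check of the result, as a sketch: RDKit returns a 167-bit vector with bit 0 unused, so that bit numbers line up with the 1-based key numbering.

```python
from rdkit import Chem
from rdkit.Chem import MACCSkeys

mol = Chem.MolFromSmiles("c1ccccc1O")
keys = MACCSkeys.GenMACCSKeys(mol)

on_bits = list(keys.GetOnBits())  # indices of the keys that are set
print(keys.GetNumBits(), on_bits)
```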

rdkit.Chem.ChemicalFeatures

Pharmacophore features.

  • ChemicalFeatures.BuildFeatureFactory(featureFile) - Create feature factory
  • factory.GetFeaturesForMol(mol) - Get pharmacophore features
  • feature.GetFamily() - Feature family (Donor, Acceptor, etc.)
  • feature.GetType() - Feature type
  • feature.GetAtomIds() - Atoms involved in feature
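
A sketch of feature perception, assuming the `BaseFeatures.fdef` definition file that ships in RDKit's data directory (located here through `RDConfig.RDDataDir`); any other feature definition file could be passed instead.

```python
import os

from rdkit import Chem, RDConfig
from rdkit.Chem import ChemicalFeatures

# Build a factory from the bundled feature definitions.
fdef_path = os.path.join(RDConfig.RDDataDir, "BaseFeatures.fdef")
factory = ChemicalFeatures.BuildFeatureFactory(fdef_path)

mol = Chem.MolFromSmiles("OCc1ccccc1")  # benzyl alcohol
features = factory.GetFeaturesForMol(mol)

summary = [(f.GetFamily(), f.GetType(), tuple(f.GetAtomIds())) for f in features]
families = {f.GetFamily() for f in features}
for row in summary:
    print(row)
```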

rdkit.ML.Cluster.Butina

Clustering algorithms.

  • Butina.ClusterData(distances, nPts, distThresh, isDistData=True) - Butina clustering
    • Returns a tuple of clusters; each cluster is a tuple of point indices with the centroid first
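
`ClusterData` expects the distances as a flattened lower-triangle matrix. The sketch below builds that list from Tanimoto similarities of Morgan fingerprints; the molecules and the 0.7 distance threshold are arbitrary choices for illustration.

```python
from rdkit import Chem, DataStructs
from rdkit.Chem import rdFingerprintGenerator
from rdkit.ML.Cluster import Butina

smiles = ["CCO", "CCCO", "CCCCO", "c1ccccc1", "Cc1ccccc1"]
generator = rdFingerprintGenerator.GetMorganGenerator(radius=2, fpSize=2048)
fps = [generator.GetFingerprint(Chem.MolFromSmiles(s)) for s in smiles]

# Lower triangle, row by row: d(1,0), d(2,0), d(2,1), d(3,0), ...
distances = []
for i in range(1, len(fps)):
    sims = DataStructs.BulkTanimotoSimilarity(fps[i], fps[:i])
    distances.extend(1.0 - s for s in sims)

clusters = Butina.ClusterData(distances, len(fps), 0.7, isDistData=True)
for cluster in clusters:
    print([smiles[idx] for idx in cluster])
```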

rdkit.Chem.rdFingerprintGenerator

Modern fingerprint generation API (RDKit 2020.09+).

  • rdFingerprintGenerator.GetMorganGenerator(radius=2, fpSize=2048) - Morgan generator
  • rdFingerprintGenerator.GetRDKitFPGenerator(minPath=1, maxPath=7, fpSize=2048) - RDKit FP generator
  • rdFingerprintGenerator.GetAtomPairGenerator(minDistance=1, maxDistance=30) - Atom pair generator
  • generator.GetFingerprint(mol) - Generate fingerprint
  • generator.GetCountFingerprint(mol) - Count-based fingerprint
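
A sketch contrasting the two outputs of one generator: the bit vector records presence only, while the count fingerprint records how often each hashed feature occurs.

```python
from rdkit import Chem
from rdkit.Chem import rdFingerprintGenerator

mol = Chem.MolFromSmiles("CCCCCC")  # hexane: repeated environments
generator = rdFingerprintGenerator.GetMorganGenerator(radius=2, fpSize=2048)

# Fixed-length bit vector.
bit_fp = generator.GetFingerprint(mol)

# Count vector of the same length; only non-zero entries are stored.
count_fp = generator.GetCountFingerprint(mol)
counts = count_fp.GetNonzeroElements()  # {bit index: count}
print(bit_fp.GetNumBits(), bit_fp.GetNumOnBits(), counts)
```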

Common Parameters

Sanitization Operations

  • SANITIZE_NONE - No sanitization
  • SANITIZE_ALL - All operations (default)
  • SANITIZE_CLEANUP - Basic cleanup
  • SANITIZE_PROPERTIES - Calculate properties
  • SANITIZE_SYMMRINGS - Symmetrize rings
  • SANITIZE_KEKULIZE - Kekulize aromatic rings
  • SANITIZE_FINDRADICALS - Find radical electrons
  • SANITIZE_SETAROMATICITY - Set aromaticity
  • SANITIZE_SETCONJUGATION - Set conjugation
  • SANITIZE_SETHYBRIDIZATION - Set hybridization
  • SANITIZE_CLEANUPCHIRALITY - Cleanup chirality

Bond Types

  • BondType.SINGLE - Single bond
  • BondType.DOUBLE - Double bond
  • BondType.TRIPLE - Triple bond
  • BondType.AROMATIC - Aromatic bond
  • BondType.DATIVE - Dative bond
  • BondType.UNSPECIFIED - Unspecified

Hybridization

  • HybridizationType.S - S
  • HybridizationType.SP - SP
  • HybridizationType.SP2 - SP2
  • HybridizationType.SP3 - SP3
  • HybridizationType.SP3D - SP3D
  • HybridizationType.SP3D2 - SP3D2

Chirality

  • ChiralType.CHI_UNSPECIFIED - Unspecified
  • ChiralType.CHI_TETRAHEDRAL_CW - Clockwise
  • ChiralType.CHI_TETRAHEDRAL_CCW - Counter-clockwise

Installation

# Using conda (recommended)
conda install -c conda-forge rdkit

# Using pip (the older rdkit-pypi package name is deprecated)
pip install rdkit

Importing

# Core functionality
from rdkit import Chem
from rdkit.Chem import AllChem

# Descriptors
from rdkit.Chem import Descriptors

# Drawing
from rdkit.Chem import Draw

# Similarity
from rdkit import DataStructs