gh-k-dense-ai-claude-scient…/skills/rdkit/references/descriptors_reference.md
2025-11-30 08:30:10 +08:00

RDKit Molecular Descriptors Reference

Complete reference for molecular descriptors available in RDKit's Descriptors module.

Usage

from rdkit import Chem
from rdkit.Chem import Descriptors

mol = Chem.MolFromSmiles('CCO')

# Calculate individual descriptor
mw = Descriptors.MolWt(mol)

# Calculate all descriptors at once
all_desc = Descriptors.CalcMolDescriptors(mol)
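
Descriptor functions expect a valid molecule, and `Chem.MolFromSmiles` returns `None` when parsing fails. A minimal sketch of a guard around a single descriptor call follows; the helper name `safe_mol_wt` is hypothetical and not part of RDKit.

```python
from rdkit import Chem
from rdkit.Chem import Descriptors

def safe_mol_wt(smiles):
    """Return MolWt for a SMILES string, or None if it cannot be parsed."""
    mol = Chem.MolFromSmiles(smiles)
    if mol is None:  # parsing failed, so no descriptor can be computed
        return None
    return Descriptors.MolWt(mol)

ethanol_mw = safe_mol_wt('CCO')
bad_mw = safe_mol_wt('not a smiles')  # RDKit logs a parse error and returns None
```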

Molecular Weight and Mass

MolWt

Average molecular weight of the molecule.

Descriptors.MolWt(mol)

ExactMolWt

Monoisotopic mass, computed from the most abundant isotope of each element (or the labeled isotope when an atom specifies one).

Descriptors.ExactMolWt(mol)

HeavyAtomMolWt

Average molecular weight ignoring hydrogens.

Descriptors.HeavyAtomMolWt(mol)
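
The three mass descriptors can differ noticeably for elements with several abundant isotopes. The sketch below compares them on bromomethane, chosen here only as an illustration; the average and monoisotopic values differ by roughly 1 Da because bromine's average atomic weight sits between its two main isotopes.

```python
from rdkit import Chem
from rdkit.Chem import Descriptors

mol = Chem.MolFromSmiles('CBr')  # bromomethane

avg_mw = Descriptors.MolWt(mol)             # isotope-averaged atomic weights
exact_mw = Descriptors.ExactMolWt(mol)      # most abundant isotopes only
heavy_mw = Descriptors.HeavyAtomMolWt(mol)  # average weight, hydrogens ignored

print(f'MolWt={avg_mw:.3f}  ExactMolWt={exact_mw:.3f}  HeavyAtomMolWt={heavy_mw:.3f}')
```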

Lipophilicity

MolLogP

Wildman-Crippen LogP (octanol-water partition coefficient).

Descriptors.MolLogP(mol)

MolMR

Wildman-Crippen molar refractivity.

Descriptors.MolMR(mol)

Polar Surface Area

TPSA

Topological polar surface area (TPSA) based on fragment contributions.

Descriptors.TPSA(mol)

LabuteASA

Labute's Approximate Surface Area (ASA).

Descriptors.LabuteASA(mol)

Hydrogen Bonding

NumHDonors

Number of hydrogen bond donors (N-H and O-H).

Descriptors.NumHDonors(mol)

NumHAcceptors

Number of hydrogen bond acceptors, as matched by RDKit's acceptor pattern (mainly N and O atoms, but not every N or O qualifies).

Descriptors.NumHAcceptors(mol)

NOCount

Number of N and O atoms.

Descriptors.NOCount(mol)

NHOHCount

Number of N-H and O-H bonds.

Descriptors.NHOHCount(mol)
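
The pattern-based counts (NumHDonors, NumHAcceptors) and the simple atom counts (NOCount, NHOHCount) do not always agree. A small sketch on acetic acid, picked as an assumed example, puts the four values side by side:

```python
from rdkit import Chem
from rdkit.Chem import Descriptors

mol = Chem.MolFromSmiles('CC(=O)O')  # acetic acid

hbond_counts = {
    'NumHDonors': Descriptors.NumHDonors(mol),        # pattern-based donor count
    'NumHAcceptors': Descriptors.NumHAcceptors(mol),  # pattern-based acceptor count
    'NOCount': Descriptors.NOCount(mol),              # every N and O atom
    'NHOHCount': Descriptors.NHOHCount(mol),          # every N-H and O-H
}
print(hbond_counts)
```

Comparing NumHAcceptors with NOCount on your own molecules is a quick way to see which N and O atoms the acceptor pattern excludes.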

Atom Counts

HeavyAtomCount

Number of heavy atoms (non-hydrogen).

Descriptors.HeavyAtomCount(mol)

NumHeteroatoms

Number of heteroatoms (non-C and non-H).

Descriptors.NumHeteroatoms(mol)

NumValenceElectrons

Total number of valence electrons.

Descriptors.NumValenceElectrons(mol)

NumRadicalElectrons

Number of radical electrons.

Descriptors.NumRadicalElectrons(mol)

Ring Descriptors

RingCount

Number of rings.

Descriptors.RingCount(mol)

NumAromaticRings

Number of aromatic rings.

Descriptors.NumAromaticRings(mol)

NumSaturatedRings

Number of saturated rings.

Descriptors.NumSaturatedRings(mol)

NumAliphaticRings

Number of aliphatic (non-aromatic) rings.

Descriptors.NumAliphaticRings(mol)

NumAromaticCarbocycles

Number of aromatic carbocycles (rings with only carbons).

Descriptors.NumAromaticCarbocycles(mol)

NumAromaticHeterocycles

Number of aromatic heterocycles (rings with heteroatoms).

Descriptors.NumAromaticHeterocycles(mol)

NumSaturatedCarbocycles

Number of saturated carbocycles.

Descriptors.NumSaturatedCarbocycles(mol)

NumSaturatedHeterocycles

Number of saturated heterocycles.

Descriptors.NumSaturatedHeterocycles(mol)

NumAliphaticCarbocycles

Number of aliphatic carbocycles.

Descriptors.NumAliphaticCarbocycles(mol)

NumAliphaticHeterocycles

Number of aliphatic heterocycles.

Descriptors.NumAliphaticHeterocycles(mol)
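
To see how the ring categories partition a molecule, the sketch below evaluates them on 4-phenylpiperidine, an assumed example with one benzene ring and one saturated nitrogen-containing ring.

```python
from rdkit import Chem
from rdkit.Chem import Descriptors

mol = Chem.MolFromSmiles('C1CNCCC1c1ccccc1')  # 4-phenylpiperidine

ring_counts = {
    'RingCount': Descriptors.RingCount(mol),
    'NumAromaticRings': Descriptors.NumAromaticRings(mol),
    'NumAromaticCarbocycles': Descriptors.NumAromaticCarbocycles(mol),
    'NumAromaticHeterocycles': Descriptors.NumAromaticHeterocycles(mol),
    'NumSaturatedRings': Descriptors.NumSaturatedRings(mol),
    'NumSaturatedHeterocycles': Descriptors.NumSaturatedHeterocycles(mol),
    'NumAliphaticRings': Descriptors.NumAliphaticRings(mol),
    'NumAliphaticHeterocycles': Descriptors.NumAliphaticHeterocycles(mol),
}
for name, value in ring_counts.items():
    print(f'{name}: {value}')
```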

Rotatable Bonds

NumRotatableBonds

Number of rotatable bonds (flexibility).

Descriptors.NumRotatableBonds(mol)

Aromatic Atoms

NumAromaticAtoms

Number of aromatic atoms.

Descriptors.NumAromaticAtoms(mol)
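
Not every RDKit release may expose this descriptor under this name. A version-independent sketch that counts aromatic atoms directly from the atom flags is shown below; `count_aromatic_atoms` is a hypothetical helper, not an RDKit function.

```python
from rdkit import Chem

def count_aromatic_atoms(mol):
    """Count atoms flagged as aromatic by RDKit's aromaticity perception."""
    return sum(1 for atom in mol.GetAtoms() if atom.GetIsAromatic())

toluene = Chem.MolFromSmiles('Cc1ccccc1')
n_aromatic = count_aromatic_atoms(toluene)  # the six ring carbons
```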

Fraction Descriptors

FractionCSP3

Fraction of carbons that are sp3 hybridized.

Descriptors.FractionCSP3(mol)

Complexity Descriptors

BertzCT

Bertz complexity index.

Descriptors.BertzCT(mol)

Ipc

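Information content of the coefficients of the characteristic polynomial of the adjacency matrix of the hydrogen-suppressed graph (complexity measure).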

Descriptors.Ipc(mol)

Kappa Shape Indices

Molecular shape descriptors based on graph invariants.

Kappa1

First kappa shape index.

Descriptors.Kappa1(mol)

Kappa2

Second kappa shape index.

Descriptors.Kappa2(mol)

Kappa3

Third kappa shape index.

Descriptors.Kappa3(mol)

Chi Connectivity Indices

Molecular connectivity indices.

Chi0, Chi1

Simple chi connectivity indices. The Descriptors module exposes only orders 0 and 1 in this form; orders 2 through 4 are available as the n and v variants below.

Descriptors.Chi0(mol)
Descriptors.Chi1(mol)

Chi0n, Chi1n, Chi2n, Chi3n, Chi4n

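Variant of the valence chi indices that uses RDKit's nVal atom value in place of the Hall-Kier valence delta; the two differ mainly for elements beyond the first row.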

Descriptors.Chi0n(mol)
Descriptors.Chi1n(mol)
Descriptors.Chi2n(mol)
Descriptors.Chi3n(mol)
Descriptors.Chi4n(mol)

Chi0v, Chi1v, Chi2v, Chi3v, Chi4v

Valence chi connectivity indices.

Descriptors.Chi0v(mol)
Descriptors.Chi1v(mol)
Descriptors.Chi2v(mol)
Descriptors.Chi3v(mol)
Descriptors.Chi4v(mol)
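
Which chi variants a given RDKit installation exposes can be checked at runtime rather than assumed. The sketch below reads the names from `Descriptors._descList`, a private list of (name, function) pairs, so its contents may change between releases.

```python
from rdkit import Chem
from rdkit.Chem import Descriptors

mol = Chem.MolFromSmiles('CC(C)Cc1ccccc1')  # isobutylbenzene, an arbitrary example

# Collect every descriptor whose name starts with 'Chi' and evaluate it
chi_values = {
    name: fn(mol)
    for name, fn in Descriptors._descList
    if name.startswith('Chi')
}
chi_names = sorted(chi_values)
print(chi_names)
```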

Hall-Kier Alpha

HallKierAlpha

Hall-Kier alpha value, the atom-size and hybridization correction used in the kappa shape indices.

Descriptors.HallKierAlpha(mol)

Balaban's J Index

BalabanJ

Balaban's J index (branching descriptor).

Descriptors.BalabanJ(mol)

EState Indices

Electrotopological state indices.

MaxEStateIndex

Maximum E-state value.

Descriptors.MaxEStateIndex(mol)

MinEStateIndex

Minimum E-state value.

Descriptors.MinEStateIndex(mol)

MaxAbsEStateIndex

Maximum absolute E-state value.

Descriptors.MaxAbsEStateIndex(mol)

MinAbsEStateIndex

Minimum absolute E-state value.

Descriptors.MinAbsEStateIndex(mol)

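Partial Charges

Computed from Gasteiger partial charges. Values may be NaN or infinite when charges cannot be assigned to every atom.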

MaxPartialCharge

Maximum partial charge.

Descriptors.MaxPartialCharge(mol)

MinPartialCharge

Minimum partial charge.

Descriptors.MinPartialCharge(mol)

MaxAbsPartialCharge

Maximum absolute partial charge.

Descriptors.MaxAbsPartialCharge(mol)

MinAbsPartialCharge

Minimum absolute partial charge.

Descriptors.MinAbsPartialCharge(mol)
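
Partial-charge descriptors can come back as NaN or infinite for some inputs, so it is worth checking them before use. A minimal sketch follows; `partial_charge_range` is a hypothetical helper name.

```python
import math

from rdkit import Chem
from rdkit.Chem import Descriptors

def partial_charge_range(mol):
    """Return (min, max) partial charge, or None if either value is not finite."""
    lo = Descriptors.MinPartialCharge(mol)
    hi = Descriptors.MaxPartialCharge(mol)
    if not (math.isfinite(lo) and math.isfinite(hi)):
        return None
    return lo, hi

charge_range = partial_charge_range(Chem.MolFromSmiles('CCO'))
```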

Fingerprint Density

Number of distinct Morgan fingerprint features divided by the number of heavy atoms.

FpDensityMorgan1

Morgan fingerprint density at radius 1.

Descriptors.FpDensityMorgan1(mol)

FpDensityMorgan2

Morgan fingerprint density at radius 2.

Descriptors.FpDensityMorgan2(mol)

FpDensityMorgan3

Morgan fingerprint density at radius 3.

Descriptors.FpDensityMorgan3(mol)

PEOE VSA Descriptors

Partial Equalization of Orbital Electronegativities (PEOE) VSA descriptors.

PEOE_VSA1 through PEOE_VSA14

MOE-type descriptors using partial charges and surface area contributions.

Descriptors.PEOE_VSA1(mol)
# ... through PEOE_VSA14

SMR VSA Descriptors

Molecular refractivity VSA descriptors.

SMR_VSA1 through SMR_VSA10

MOE-type descriptors using MR contributions and surface area.

Descriptors.SMR_VSA1(mol)
# ... through SMR_VSA10

SLogP VSA Descriptors

LogP VSA descriptors.

SLogP_VSA1 through SLogP_VSA12

MOE-type descriptors using LogP contributions and surface area.

Descriptors.SLogP_VSA1(mol)
# ... through SLogP_VSA12

EState VSA Descriptors

EState_VSA1 through EState_VSA11

MOE-type descriptors using E-state indices and surface area.

Descriptors.EState_VSA1(mol)
# ... through EState_VSA11

VSA Descriptors

van der Waals surface area descriptors.

VSA_EState1 through VSA_EState10

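Sums of atomic E-state indices binned by VSA contribution. This is the reverse of EState_VSA, which sums VSA contributions binned by E-state value.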

Descriptors.VSA_EState1(mol)
# ... through VSA_EState10

BCUT Descriptors

Burden-CAS-University of Texas eigenvalue descriptors.

BCUT2D_MWHI

Highest eigenvalue of the Burden matrix weighted by atomic mass.

Descriptors.BCUT2D_MWHI(mol)

BCUT2D_MWLOW

Lowest eigenvalue of the Burden matrix weighted by atomic mass.

Descriptors.BCUT2D_MWLOW(mol)

BCUT2D_CHGHI

Highest eigenvalue weighted by partial charges.

Descriptors.BCUT2D_CHGHI(mol)

BCUT2D_CHGLO

Lowest eigenvalue weighted by partial charges.

Descriptors.BCUT2D_CHGLO(mol)

BCUT2D_LOGPHI

Highest eigenvalue weighted by LogP.

Descriptors.BCUT2D_LOGPHI(mol)

BCUT2D_LOGPLOW

Lowest eigenvalue weighted by LogP.

Descriptors.BCUT2D_LOGPLOW(mol)

BCUT2D_MRHI

Highest eigenvalue weighted by molar refractivity.

Descriptors.BCUT2D_MRHI(mol)

BCUT2D_MRLOW

Lowest eigenvalue weighted by molar refractivity.

Descriptors.BCUT2D_MRLOW(mol)

Autocorrelation Descriptors

AUTOCORR2D

2D autocorrelation descriptors (if enabled). These indices measure how atomic properties are distributed over topological (bond-count) distances in the molecular graph, not over 3D space.
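
One way to obtain these values without relying on the descriptor list is to call `rdMolDescriptors.CalcAUTOCORR2D` directly. This is a minimal sketch; the returned vector is positional, and the meaning of each position is not covered here.

```python
from rdkit import Chem
from rdkit.Chem import rdMolDescriptors

mol = Chem.MolFromSmiles('CCO')

# Returns a flat list of 2D autocorrelation values for the molecule
autocorr = list(rdMolDescriptors.CalcAUTOCORR2D(mol))
print(len(autocorr))
```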

MQN Descriptors

Molecular Quantum Numbers - 42 simple descriptors.

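mqn1 through mqn42

Integer descriptors counting various molecular features.

# MQNs are not part of CalcMolDescriptors output; compute them via rdMolDescriptors
from rdkit.Chem import rdMolDescriptors
mqn_values = rdMolDescriptors.MQNs_(mol)  # list of 42 integers
mqns = {f'mqn{i}': v for i, v in enumerate(mqn_values, start=1)}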

QED

qed

Quantitative Estimate of Drug-likeness.

Descriptors.qed(mol)

Lipinski's Rule of Five

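Check drug-likeness using Lipinski's criteria. This version is strict and requires all four criteria to pass, whereas the rule is often applied with one violation allowed:

def lipinski_rule_of_five(mol):
    # Each flag is True when the corresponding criterion is satisfied
    mw = Descriptors.MolWt(mol) <= 500            # molecular weight <= 500 Da
    logp = Descriptors.MolLogP(mol) <= 5          # Crippen LogP <= 5
    hbd = Descriptors.NumHDonors(mol) <= 5        # at most 5 H-bond donors
    hba = Descriptors.NumHAcceptors(mol) <= 10    # at most 10 H-bond acceptors
    return mw and logp and hbd and hba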
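
The function above passes a molecule only when all four criteria hold. Since the rule is often applied with one violation tolerated, a variant that counts violations can be more useful; the sketch below is one way to write it, with `lipinski_violations` as a hypothetical helper name.

```python
from rdkit import Chem
from rdkit.Chem import Descriptors

def lipinski_violations(mol):
    """Count how many of the four Lipinski criteria the molecule violates."""
    checks = [
        Descriptors.MolWt(mol) > 500,
        Descriptors.MolLogP(mol) > 5,
        Descriptors.NumHDonors(mol) > 5,
        Descriptors.NumHAcceptors(mol) > 10,
    ]
    return sum(checks)

ethanol = Chem.MolFromSmiles('CCO')
long_alkane = Chem.MolFromSmiles('C' * 40)  # C40 chain: too heavy and too lipophilic

print(lipinski_violations(ethanol), lipinski_violations(long_alkane))
```

A caller can then decide on the threshold, for example `lipinski_violations(mol) <= 1`.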

Batch Descriptor Calculation

Calculate all descriptors at once:

from rdkit import Chem
from rdkit.Chem import Descriptors

mol = Chem.MolFromSmiles('CCO')

# Get all descriptors as dictionary
all_descriptors = Descriptors.CalcMolDescriptors(mol)

# Access specific descriptor
mw = all_descriptors['MolWt']
logp = all_descriptors['MolLogP']

# Get list of available descriptor names
# (_descList is a private list of (name, function) pairs and may change between releases)
descriptor_names = [desc[0] for desc in Descriptors._descList]
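
When only a handful of descriptors are needed for many molecules, computing all of them is wasteful. A sketch using `MolecularDescriptorCalculator` from `rdkit.ML.Descriptors.MoleculeDescriptors` with a chosen subset follows; the SMILES list and the selected names are assumptions for illustration.

```python
from rdkit import Chem
from rdkit.ML.Descriptors import MoleculeDescriptors

names = ['MolWt', 'MolLogP', 'TPSA', 'NumHDonors']
calc = MoleculeDescriptors.MolecularDescriptorCalculator(names)

smiles_list = ['CCO', 'c1ccccc1', 'CC(=O)O']
rows = []
for smi in smiles_list:
    mol = Chem.MolFromSmiles(smi)
    if mol is None:  # skip anything that fails to parse
        continue
    values = calc.CalcDescriptors(mol)  # tuple in the same order as names
    rows.append({'smiles': smi, **dict(zip(names, values))})

for row in rows:
    print(row)
```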

Descriptor Categories Summary

  1. Physicochemical: MolWt, MolLogP, MolMR, TPSA
  2. Topological: BertzCT, BalabanJ, Kappa indices
  3. Electronic: Partial charges, E-state indices
  4. Shape: Kappa indices, BCUT descriptors
  5. Connectivity: Chi indices
  6. 2D Fingerprints: FpDensity descriptors
  7. Atom counts: Heavy atoms, heteroatoms, rings
  8. Drug-likeness: QED, Lipinski parameters
  9. Flexibility: NumRotatableBonds, HallKierAlpha
  10. Surface area: VSA-based descriptors

Common Use Cases

Drug-likeness Screening

def screen_druglikeness(mol):
    return {
        'MW': Descriptors.MolWt(mol),
        'LogP': Descriptors.MolLogP(mol),
        'HBD': Descriptors.NumHDonors(mol),
        'HBA': Descriptors.NumHAcceptors(mol),
        'TPSA': Descriptors.TPSA(mol),
        'RotBonds': Descriptors.NumRotatableBonds(mol),
        'AromaticRings': Descriptors.NumAromaticRings(mol),
        'QED': Descriptors.qed(mol)
    }

Lead-like Filtering

def is_leadlike(mol):
    mw = 250 <= Descriptors.MolWt(mol) <= 350
    logp = Descriptors.MolLogP(mol) <= 3.5
    rot_bonds = Descriptors.NumRotatableBonds(mol) <= 7
    return mw and logp and rot_bonds
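
As a usage sketch, the same lead-like criteria can be applied to a list of SMILES strings. The function is repeated so the example is self-contained, and the input molecules are arbitrary choices for illustration.

```python
from rdkit import Chem
from rdkit.Chem import Descriptors

def is_leadlike(mol):
    mw = 250 <= Descriptors.MolWt(mol) <= 350
    logp = Descriptors.MolLogP(mol) <= 3.5
    rot_bonds = Descriptors.NumRotatableBonds(mol) <= 7
    return mw and logp and rot_bonds

candidates = {
    'ethanol': 'CCO',
    'penicillin G': 'CC1(C)SC2C(NC(=O)Cc3ccccc3)C(=O)N2C1C(=O)O',
}

leadlike = []
for name, smi in candidates.items():
    mol = Chem.MolFromSmiles(smi)
    if mol is not None and is_leadlike(mol):
        leadlike.append(name)

print(leadlike)
```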

Diversity Analysis

def molecular_complexity(mol):
    return {
        'BertzCT': Descriptors.BertzCT(mol),
        'NumRings': Descriptors.RingCount(mol),
        'NumRotBonds': Descriptors.NumRotatableBonds(mol),
        'FractionCSP3': Descriptors.FractionCSP3(mol),
        'NumAromaticRings': Descriptors.NumAromaticRings(mol)
    }

Tips

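  1. Use batch calculation (CalcMolDescriptors) when you need most of the descriptors; call individual functions when you need only a few
  2. Check for None - Chem.MolFromSmiles returns None for unparsable input, and CalcMolDescriptors stores its missingVal (None by default) for any descriptor that fails
  3. Normalize descriptors for machine learning applications, since their scales differ by orders of magnitude
  4. Select relevant descriptors - not all 200+ descriptors are useful for every task
  5. Consider 3D descriptors separately (they require a conformer with 3D coordinates)
  6. Validate ranges - check that descriptor values are finite and fall in expected ranges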
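
The None check and normalization tips can be combined in a small preprocessing step. The sketch below builds a descriptor matrix with NumPy and applies z-score scaling; the choice of descriptors, molecules, and scaling method is an assumption for illustration, not a recommendation from RDKit.

```python
import numpy as np
from rdkit import Chem
from rdkit.Chem import Descriptors

smiles_list = ['CCO', 'c1ccccc1', 'CC(=O)O', 'CCCCCC']
funcs = [Descriptors.MolWt, Descriptors.MolLogP, Descriptors.TPSA]

# Drop molecules that fail to parse before computing anything
mols = [m for m in map(Chem.MolFromSmiles, smiles_list) if m is not None]

# Rows are molecules, columns are descriptors
X = np.array([[f(m) for f in funcs] for m in mols], dtype=float)

# Z-score each column; guard against zero variance to avoid division by zero
mean = X.mean(axis=0)
std = X.std(axis=0)
std[std == 0] = 1.0
X_scaled = (X - mean) / std
```

In a real pipeline the mean and standard deviation would be computed on the training set only and reused for new molecules.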