Medchem API Reference
Comprehensive reference for all medchem modules and functions.
Module: medchem.rules
Class: RuleFilters
Filter molecules based on multiple medicinal chemistry rules.
Constructor:
RuleFilters(rule_list: List[str])
Parameters:
rule_list: List of rule names to apply. See available rules below.
Methods:
__call__(mols: List[Chem.Mol], n_jobs: int = 1, progress: bool = False) -> Dict
- mols: List of RDKit molecule objects
- n_jobs: Number of parallel jobs (-1 uses all cores)
- progress: Show progress bar
- Returns: Dictionary with results for each rule
Example:
import medchem as mc
# mol_list is assumed to be a list of RDKit molecule objects
rfilter = mc.rules.RuleFilters(rule_list=["rule_of_five", "rule_of_cns"])
results = rfilter(mols=mol_list, n_jobs=-1, progress=True)
Module: medchem.rules.basic_rules
Individual rule functions that can be applied to single molecules.
rule_of_five()
rule_of_five(mol: Union[str, Chem.Mol]) -> bool
Lipinski's Rule of Five for oral bioavailability.
Criteria:
- Molecular weight ≤ 500 Da
- LogP ≤ 5
- H-bond donors ≤ 5
- H-bond acceptors ≤ 10
Parameters:
mol: SMILES string or RDKit molecule object
Returns: True if molecule passes all criteria
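The four thresholds above can be written as a plain predicate. The sketch below is not medchem's implementation: it assumes the descriptors have already been computed elsewhere, and the `Descriptors` class and `passes_rule_of_five` function are hypothetical names used only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Descriptors:
    """Precomputed properties for one molecule (hypothetical container)."""
    mw: float    # molecular weight in Da
    logp: float  # calculated LogP
    hbd: int     # H-bond donor count
    hba: int     # H-bond acceptor count


def passes_rule_of_five(d: Descriptors) -> bool:
    """Return True only if all four criteria listed above hold."""
    return d.mw <= 500 and d.logp <= 5 and d.hbd <= 5 and d.hba <= 10


# Illustrative descriptor values, not measured data.
small = Descriptors(mw=180.2, logp=1.2, hbd=1, hba=4)
large = Descriptors(mw=720.9, logp=6.1, hbd=4, hba=12)
print(passes_rule_of_five(small), passes_rule_of_five(large))
```

Because every bound is inclusive, a molecule sitting exactly on all four limits still passes in this sketch.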
rule_of_three()
rule_of_three(mol: Union[str, Chem.Mol]) -> bool
Rule of Three for fragment screening libraries.
Criteria:
- Molecular weight ≤ 300 Da
- LogP ≤ 3
- H-bond donors ≤ 3
- H-bond acceptors ≤ 3
- Rotatable bonds ≤ 3
- Polar surface area ≤ 60 Ų
rule_of_oprea()
rule_of_oprea(mol: Union[str, Chem.Mol]) -> bool
Oprea's lead-like criteria for hit-to-lead optimization.
Criteria:
- Molecular weight: 200-350 Da
- LogP: -2 to 4
- Rotatable bonds ≤ 7
- Rings ≤ 4
rule_of_cns()
rule_of_cns(mol: Union[str, Chem.Mol]) -> bool
CNS drug-likeness rules.
Criteria:
- Molecular weight ≤ 450 Da
- LogP: -1 to 5
- H-bond donors ≤ 2
- TPSA ≤ 90 Ų
rule_of_leadlike_soft()
rule_of_leadlike_soft(mol: Union[str, Chem.Mol]) -> bool
Soft lead-like criteria (more permissive).
Criteria:
- Molecular weight: 250-450 Da
- LogP: -3 to 4
- Rotatable bonds ≤ 10
rule_of_leadlike_strict()
rule_of_leadlike_strict(mol: Union[str, Chem.Mol]) -> bool
Strict lead-like criteria (more restrictive).
Criteria:
- Molecular weight: 200-350 Da
- LogP: -2 to 3.5
- Rotatable bonds ≤ 7
- Rings: 1-3
rule_of_veber()
rule_of_veber(mol: Union[str, Chem.Mol]) -> bool
Veber's rules for oral bioavailability.
Criteria:
- Rotatable bonds ≤ 10
- TPSA ≤ 140 Ų
rule_of_reos()
rule_of_reos(mol: Union[str, Chem.Mol]) -> bool
Rapid Elimination Of Swill (REOS) filter.
Criteria:
- Molecular weight: 200-500 Da
- LogP: -5 to 5
- H-bond donors: 0-5
- H-bond acceptors: 0-10
rule_of_drug()
rule_of_drug(mol: Union[str, Chem.Mol]) -> bool
Combined drug-likeness criteria.
Criteria:
- Passes Rule of Five
- Passes Veber rules
- No PAINS substructures
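A combined rule like this is a conjunction of simpler checks. The following sketch shows that composition on precomputed properties. It does not call medchem: the property dictionary, its keys, and the helpers `ro5`, `veber`, `no_pains`, and `all_of` are assumptions made for illustration, and the PAINS check is a stand-in that reads a precomputed match count rather than doing substructure matching.

```python
from typing import Callable, Dict

Props = Dict[str, float]
Rule = Callable[[Props], bool]


def ro5(p: Props) -> bool:
    # Rule of Five thresholds as listed in this reference.
    return p["mw"] <= 500 and p["logp"] <= 5 and p["hbd"] <= 5 and p["hba"] <= 10


def veber(p: Props) -> bool:
    # Veber thresholds as listed in this reference.
    return p["rotatable_bonds"] <= 10 and p["tpsa"] <= 140


def no_pains(p: Props) -> bool:
    # Stand-in only: the caller supplies a precomputed count of PAINS matches.
    return p["pains_matches"] == 0


def all_of(*rules: Rule) -> Rule:
    """Combine rules so the result passes only when every rule passes."""
    return lambda p: all(rule(p) for rule in rules)


drug_like = all_of(ro5, veber, no_pains)

# Illustrative values, not measured data.
props = {"mw": 320.0, "logp": 2.5, "hbd": 2, "hba": 5,
         "rotatable_bonds": 4, "tpsa": 75.0, "pains_matches": 0}
print(drug_like(props))
print(drug_like(dict(props, pains_matches=1)))
```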
golden_triangle()
golden_triangle(mol: Union[str, Chem.Mol]) -> bool
Golden Triangle for drug-likeness balance.
Criteria:
- 200 ≤ MW ≤ 50×LogP + 400
- LogP: -2 to 5
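As a worked example of the two criteria above, the sketch below evaluates them directly on precomputed values. The function name `in_golden_triangle` is hypothetical, and the code only restates the inequalities listed here rather than reproducing the library's implementation.

```python
def in_golden_triangle(mw: float, logp: float) -> bool:
    """Check the two criteria exactly as listed above (hypothetical helper)."""
    upper_mw = 50 * logp + 400  # the MW ceiling rises with LogP
    return 200 <= mw <= upper_mw and -2 <= logp <= 5


# At LogP = 2.0 the allowed window is 200 <= MW <= 500.
print(in_golden_triangle(mw=450.0, logp=2.0))
# At LogP = 0.5 the ceiling drops to 425, so the same MW now fails.
print(in_golden_triangle(mw=450.0, logp=0.5))
```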
pains_filter()
pains_filter(mol: Union[str, Chem.Mol]) -> bool
Pan Assay INterference compoundS (PAINS) filter.
Returns: True if molecule does NOT contain PAINS substructures
Module: medchem.structural
Class: CommonAlertsFilters
Filter for common structural alerts derived from ChEMBL and literature.
Constructor:
CommonAlertsFilters()
Methods:
__call__(mols: List[Chem.Mol], n_jobs: int = 1, progress: bool = False) -> List[Dict]
Apply common alerts filter to a list of molecules.
Returns: List of dictionaries with keys:
- has_alerts: Boolean indicating if molecule has alerts
- alert_details: List of matched alert patterns
- num_alerts: Number of alerts found
check_mol(mol: Chem.Mol) -> Tuple[bool, List[str]]
Check a single molecule for structural alerts.
Returns: Tuple of (has_alerts, list_of_alert_names)
Class: NIBRFilters
Novartis NIBR medicinal chemistry filters.
Constructor:
NIBRFilters()
Methods:
__call__(mols: List[Chem.Mol], n_jobs: int = 1, progress: bool = False) -> List[bool]
Apply NIBR filters to molecules.
Returns: List of booleans (True if molecule passes)
Class: LillyDemeritsFilters
Eli Lilly's demerit-based structural alert system (275 rules).
Constructor:
LillyDemeritsFilters()
Methods:
__call__(mols: List[Chem.Mol], n_jobs: int = 1, progress: bool = False) -> List[Dict]
Calculate Lilly demerits for molecules.
Returns: List of dictionaries with keys:
- demerits: Total demerit score
- passes: Boolean (True if demerits ≤ 100)
- matched_patterns: List of matched patterns with scores
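The relationship between the three result keys can be sketched as follows. This assumes the total is the plain sum of the per-pattern scores, which this reference does not state explicitly. The helper `summarize_demerits` and the pattern names and scores are invented for illustration.

```python
from typing import Dict, List, Tuple

DEMERIT_CUTOFF = 100  # threshold stated above: passes if demerits <= 100


def summarize_demerits(matched: List[Tuple[str, int]]) -> Dict[str, object]:
    """Build a result dict with the three keys described above."""
    total = sum(score for _, score in matched)  # assumption: scores add up
    return {
        "demerits": total,
        "passes": total <= DEMERIT_CUTOFF,
        "matched_patterns": matched,
    }


# Pattern names and scores are made up for illustration.
clean = summarize_demerits([("pattern_a", 30), ("pattern_b", 50)])
flagged = summarize_demerits([("pattern_a", 30), ("pattern_c", 90)])
print(clean["demerits"], clean["passes"])
print(flagged["demerits"], flagged["passes"])
```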
Module: medchem.functional
High-level functional API for common operations.
nibr_filter()
nibr_filter(mols: List[Chem.Mol], n_jobs: int = 1) -> List[bool]
Apply NIBR filters using functional API.
Parameters:
- mols: List of molecules
- n_jobs: Number of parallel jobs
Returns: List of pass/fail booleans
common_alerts_filter()
common_alerts_filter(mols: List[Chem.Mol], n_jobs: int = 1) -> List[Dict]
Apply common alerts filter using functional API.
Returns: List of results dictionaries
lilly_demerits_filter()
lilly_demerits_filter(mols: List[Chem.Mol], n_jobs: int = 1) -> List[Dict]
Calculate Lilly demerits using functional API.
Module: medchem.groups
Class: ChemicalGroup
Detect specific chemical groups in molecules.
Constructor:
ChemicalGroup(groups: List[str], custom_smarts: Optional[Dict[str, str]] = None)
Parameters:
- groups: List of predefined group names
- custom_smarts: Dictionary mapping custom group names to SMARTS patterns
Predefined Groups:
- "hinge_binders": Kinase hinge binding motifs
- "phosphate_binders": Phosphate binding groups
- "michael_acceptors": Michael acceptor electrophiles
- "reactive_groups": General reactive functionalities
Methods:
has_match(mols: List[Chem.Mol]) -> List[bool]
Check if molecules contain any of the specified groups.
get_matches(mol: Chem.Mol) -> Dict[str, List[Tuple]]
Get detailed match information for a single molecule.
Returns: Dictionary mapping each group name to a list of atom-index tuples, one tuple per match
get_all_matches(mols: List[Chem.Mol]) -> List[Dict]
Get match information for all molecules.
Example:
group = mc.groups.ChemicalGroup(groups=["hinge_binders", "phosphate_binders"])
matches = group.get_all_matches(mol_list)
Module: medchem.catalogs
Class: NamedCatalogs
Access to curated chemical catalogs.
Available Catalogs:
- "functional_groups": Common functional groups
- "protecting_groups": Protecting group structures
- "reagents": Common reagents
- "fragments": Standard fragments
Usage:
catalog = mc.catalogs.NamedCatalogs.get("functional_groups")
matches = catalog.get_matches(mol)
Module: medchem.complexity
Calculate molecular complexity metrics.
calculate_complexity()
calculate_complexity(mol: Chem.Mol, method: str = "bertz") -> float
Calculate complexity score for a molecule.
Parameters:
- mol: RDKit molecule
- method: Complexity metric ("bertz", "whitlock", "barone")
Returns: Complexity score (higher = more complex)
Class: ComplexityFilter
Filter molecules by complexity threshold.
Constructor:
ComplexityFilter(max_complexity: float, method: str = "bertz")
Methods:
__call__(mols: List[Chem.Mol], n_jobs: int = 1) -> List[bool]
Filter molecules exceeding complexity threshold.
Module: medchem.constraints
Class: Constraints
Apply custom property-based constraints.
Constructor:
Constraints(
    mw_range: Optional[Tuple[float, float]] = None,
    logp_range: Optional[Tuple[float, float]] = None,
    tpsa_max: Optional[float] = None,
    tpsa_range: Optional[Tuple[float, float]] = None,
    hbd_max: Optional[int] = None,
    hba_max: Optional[int] = None,
    rotatable_bonds_max: Optional[int] = None,
    rings_range: Optional[Tuple[int, int]] = None,
    aromatic_rings_max: Optional[int] = None,
)
Parameters: All parameters are optional. Specify only the constraints needed.
Methods:
__call__(mols: List[Chem.Mol], n_jobs: int = 1) -> List[Dict]
Apply constraints to molecules.
Returns: List of dictionaries with keys:
- passes: Boolean indicating if all constraints pass
- violations: List of constraint names that failed
Example:
constraints = mc.constraints.Constraints(
    mw_range=(200, 500),
    logp_range=(-2, 5),
    tpsa_max=140
)
results = constraints(mols=mol_list, n_jobs=-1)
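To show how a passes/violations result of the shape described above might be produced, here is a minimal sketch that works on precomputed properties. It is not the library's code. The function `check_constraints`, the property keys, and the choice of inclusive bounds are assumptions, and only three of the constructor's parameters are covered.

```python
from typing import Dict, List, Optional, Tuple

Range = Tuple[float, float]


def check_constraints(
    props: Dict[str, float],
    mw_range: Optional[Range] = None,
    logp_range: Optional[Range] = None,
    tpsa_max: Optional[float] = None,
) -> Dict[str, object]:
    """Evaluate only the constraints that were supplied (not None)."""
    violations: List[str] = []
    if mw_range is not None and not (mw_range[0] <= props["mw"] <= mw_range[1]):
        violations.append("mw_range")
    if logp_range is not None and not (logp_range[0] <= props["logp"] <= logp_range[1]):
        violations.append("logp_range")
    if tpsa_max is not None and props["tpsa"] > tpsa_max:
        violations.append("tpsa_max")
    return {"passes": not violations, "violations": violations}


# Illustrative property values, not measured data.
result = check_constraints(
    {"mw": 612.4, "logp": 3.1, "tpsa": 155.0},
    mw_range=(200, 500),
    logp_range=(-2, 5),
    tpsa_max=140,
)
print(result)
```

Recording the names of the failed constraints, rather than a single boolean, makes it possible to see which property pushed a molecule out of range.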
Module: medchem.query
Query language for complex filtering.
parse()
parse(query: str) -> Query
Parse a medchem query string into a Query object.
Query Syntax:
- Operators: AND, OR, NOT
- Comparisons: <, >, <=, >=, ==, !=
- Properties: complexity, lilly_demerits, mw, logp, tpsa
- Rules: rule_of_five, rule_of_cns, etc.
- Filters: common_alerts, nibr_filter, pains_filter
Example Queries:
"rule_of_five AND NOT common_alerts"
"rule_of_cns AND complexity < 400"
"mw > 200 AND mw < 500 AND logp < 5"
"(rule_of_five OR rule_of_oprea) AND NOT pains_filter"
Class: Query
Methods:
apply(mols: List[Chem.Mol], n_jobs: int = 1) -> List[bool]
Apply parsed query to molecules.
Example:
query = mc.query.parse("rule_of_five AND NOT common_alerts")
results = query.apply(mols=mol_list, n_jobs=-1)
passing_mols = [mol for mol, passes in zip(mol_list, results) if passes]
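To make the query syntax concrete, the sketch below evaluates a query string against one molecule's precomputed values with a small recursive-descent parser. It is an illustration, not medchem's parser. The `evaluate` function, the precedence order (NOT binds tighter than AND, which binds tighter than OR), and the convention that each rule or filter name maps to a caller-supplied boolean flag are all assumptions.

```python
import operator
import re
from typing import Dict, List, Optional

TOKEN = re.compile(r"\s*(<=|>=|==|!=|<|>|\(|\)|[A-Za-z_]\w*|-?\d+(?:\.\d+)?)")
COMPARE = {"<": operator.lt, ">": operator.gt, "<=": operator.le,
           ">=": operator.ge, "==": operator.eq, "!=": operator.ne}


def tokenize(query: str) -> List[str]:
    tokens, pos, query = [], 0, query.rstrip()
    while pos < len(query):
        match = TOKEN.match(query, pos)
        if match is None:
            raise ValueError(f"unexpected input at position {pos}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens


def evaluate(query: str, values: Dict[str, object]) -> bool:
    """Evaluate a query against one molecule's precomputed values."""
    tokens = tokenize(query)
    pos = 0

    def peek() -> Optional[str]:
        return tokens[pos] if pos < len(tokens) else None

    def take() -> str:
        nonlocal pos
        if pos >= len(tokens):
            raise ValueError("unexpected end of query")
        pos += 1
        return tokens[pos - 1]

    def parse_or() -> bool:
        result = parse_and()
        while peek() == "OR":
            take()
            rhs = parse_and()  # always parse the right side, then combine
            result = result or rhs
        return result

    def parse_and() -> bool:
        result = parse_not()
        while peek() == "AND":
            take()
            rhs = parse_not()
            result = result and rhs
        return result

    def parse_not() -> bool:
        if peek() == "NOT":
            take()
            return not parse_not()
        return parse_atom()

    def parse_atom() -> bool:
        tok = take()
        if tok == "(":
            result = parse_or()
            if take() != ")":
                raise ValueError("expected ')'")
            return result
        if peek() in COMPARE:
            compare = COMPARE[take()]
            return compare(values[tok], float(take()))
        return bool(values[tok])  # bare name: a precomputed boolean flag

    result = parse_or()
    if pos != len(tokens):
        raise ValueError("unexpected trailing tokens")
    return result


# Illustrative values; a True filter flag here means the molecule was flagged.
mol = {"mw": 342.4, "logp": 2.7, "rule_of_five": True,
       "rule_of_oprea": False, "common_alerts": False, "pains_filter": True}
print(evaluate("mw > 200 AND mw < 500 AND logp < 5", mol))
print(evaluate("(rule_of_five OR rule_of_oprea) AND NOT pains_filter", mol))
```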
Module: medchem.utils
Utility functions for working with molecules.
batch_process()
batch_process(
    mols: List[Chem.Mol],
    func: Callable,
    n_jobs: int = 1,
    progress: bool = False,
    batch_size: Optional[int] = None
) -> List
Process molecules in parallel batches.
Parameters:
- mols: List of molecules
- func: Function to apply to each molecule
- n_jobs: Number of parallel workers
- progress: Show progress bar
- batch_size: Size of processing batches
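The general idea of batched parallel processing can be sketched with the standard library alone. The code below is not how `batch_process` is implemented: `chunked` and `batch_map` are hypothetical helpers, threads are used for simplicity where a real implementation may use processes, and SMILES strings stand in for molecule objects.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterator, List, Sequence, TypeVar

T = TypeVar("T")
R = TypeVar("R")


def chunked(items: Sequence[T], size: int) -> Iterator[Sequence[T]]:
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def batch_map(items: Sequence[T], func: Callable[[T], R],
              n_workers: int = 1, batch_size: int = 2) -> List[R]:
    """Apply func to every item, one batch per task, preserving input order."""
    def run_batch(batch: Sequence[T]) -> List[R]:
        return [func(item) for item in batch]

    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        # Executor.map returns results in submission order, so the
        # flattened output lines up with the input sequence.
        batch_results = pool.map(run_batch, chunked(items, batch_size))
        return [result for batch in batch_results for result in batch]


# Strings stand in for molecules; len stands in for a per-molecule function.
lengths = batch_map(["CCO", "c1ccccc1", "CC(=O)O"], len, n_workers=2, batch_size=2)
print(lengths)
```

Grouping items into batches reduces per-task scheduling overhead when the per-item function is cheap.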
standardize_mol()
standardize_mol(mol: Chem.Mol) -> Chem.Mol
Standardize molecule representation (sanitize, neutralize charges, etc.).
Common Patterns
Pattern: Parallel Processing
All filters support parallelization:
# Use all CPU cores
results = filter_object(mols=mol_list, n_jobs=-1, progress=True)
# Use specific number of cores
results = filter_object(mols=mol_list, n_jobs=4, progress=True)
Pattern: Combining Multiple Filters
import medchem as mc
# Apply multiple filters
rule_filter = mc.rules.RuleFilters(rule_list=["rule_of_five"])
alert_filter = mc.structural.CommonAlertsFilters()
lilly_filter = mc.structural.LillyDemeritsFilters()
# Get results
rule_results = rule_filter(mols=mol_list, n_jobs=-1)
alert_results = alert_filter(mols=mol_list, n_jobs=-1)
lilly_results = lilly_filter(mols=mol_list, n_jobs=-1)
# Combine criteria
passing_mols = [
    mol for i, mol in enumerate(mol_list)
    if rule_results[i]["rule_of_five"]
    and not alert_results[i]["has_alerts"]
    and lilly_results[i]["passes"]
]
Pattern: Working with DataFrames
import pandas as pd
import datamol as dm
import medchem as mc
# Load data
df = pd.read_csv("molecules.csv")
df["mol"] = df["smiles"].apply(dm.to_mol)
# Apply filters
rfilter = mc.rules.RuleFilters(rule_list=["rule_of_five", "rule_of_cns"])
results = rfilter(mols=df["mol"].tolist(), n_jobs=-1)
# Add results to dataframe
df["passes_ro5"] = [r["rule_of_five"] for r in results]
df["passes_cns"] = [r["rule_of_cns"] for r in results]
# Filter dataframe
filtered_df = df[df["passes_ro5"] & df["passes_cns"]]