# Medchem API Reference Comprehensive reference for all medchem modules and functions. ## Module: medchem.rules ### Class: RuleFilters Filter molecules based on multiple medicinal chemistry rules. **Constructor:** ```python RuleFilters(rule_list: List[str]) ``` **Parameters:** - `rule_list`: List of rule names to apply. See available rules below. **Methods:** ```python __call__(mols: List[Chem.Mol], n_jobs: int = 1, progress: bool = False) -> Dict ``` - `mols`: List of RDKit molecule objects - `n_jobs`: Number of parallel jobs (-1 uses all cores) - `progress`: Show progress bar - **Returns**: Dictionary with results for each rule **Example:** ```python rfilter = mc.rules.RuleFilters(rule_list=["rule_of_five", "rule_of_cns"]) results = rfilter(mols=mol_list, n_jobs=-1, progress=True) ``` ### Module: medchem.rules.basic_rules Individual rule functions that can be applied to single molecules. #### rule_of_five() ```python rule_of_five(mol: Union[str, Chem.Mol]) -> bool ``` Lipinski's Rule of Five for oral bioavailability. **Criteria:** - Molecular weight ≤ 500 Da - LogP ≤ 5 - H-bond donors ≤ 5 - H-bond acceptors ≤ 10 **Parameters:** - `mol`: SMILES string or RDKit molecule object **Returns:** True if molecule passes all criteria #### rule_of_three() ```python rule_of_three(mol: Union[str, Chem.Mol]) -> bool ``` Rule of Three for fragment screening libraries. **Criteria:** - Molecular weight ≤ 300 Da - LogP ≤ 3 - H-bond donors ≤ 3 - H-bond acceptors ≤ 3 - Rotatable bonds ≤ 3 - Polar surface area ≤ 60 Ų #### rule_of_oprea() ```python rule_of_oprea(mol: Union[str, Chem.Mol]) -> bool ``` Oprea's lead-like criteria for hit-to-lead optimization. **Criteria:** - Molecular weight: 200-350 Da - LogP: -2 to 4 - Rotatable bonds ≤ 7 - Rings ≤ 4 #### rule_of_cns() ```python rule_of_cns(mol: Union[str, Chem.Mol]) -> bool ``` CNS drug-likeness rules. **Criteria:** - Molecular weight ≤ 450 Da - LogP: -1 to 5 - H-bond donors ≤ 2 - TPSA ≤ 90 Ų #### rule_of_leadlike_soft() ```python rule_of_leadlike_soft(mol: Union[str, Chem.Mol]) -> bool ``` Soft lead-like criteria (more permissive). **Criteria:** - Molecular weight: 250-450 Da - LogP: -3 to 4 - Rotatable bonds ≤ 10 #### rule_of_leadlike_strict() ```python rule_of_leadlike_strict(mol: Union[str, Chem.Mol]) -> bool ``` Strict lead-like criteria (more restrictive). **Criteria:** - Molecular weight: 200-350 Da - LogP: -2 to 3.5 - Rotatable bonds ≤ 7 - Rings: 1-3 #### rule_of_veber() ```python rule_of_veber(mol: Union[str, Chem.Mol]) -> bool ``` Veber's rules for oral bioavailability. **Criteria:** - Rotatable bonds ≤ 10 - TPSA ≤ 140 Ų #### rule_of_reos() ```python rule_of_reos(mol: Union[str, Chem.Mol]) -> bool ``` Rapid Elimination Of Swill (REOS) filter. **Criteria:** - Molecular weight: 200-500 Da - LogP: -5 to 5 - H-bond donors: 0-5 - H-bond acceptors: 0-10 #### rule_of_drug() ```python rule_of_drug(mol: Union[str, Chem.Mol]) -> bool ``` Combined drug-likeness criteria. **Criteria:** - Passes Rule of Five - Passes Veber rules - No PAINS substructures #### golden_triangle() ```python golden_triangle(mol: Union[str, Chem.Mol]) -> bool ``` Golden Triangle for drug-likeness balance. **Criteria:** - 200 ≤ MW ≤ 50×LogP + 400 - LogP: -2 to 5 #### pains_filter() ```python pains_filter(mol: Union[str, Chem.Mol]) -> bool ``` Pan Assay INterference compoundS (PAINS) filter. **Returns:** True if molecule does NOT contain PAINS substructures --- ## Module: medchem.structural ### Class: CommonAlertsFilters Filter for common structural alerts derived from ChEMBL and literature. **Constructor:** ```python CommonAlertsFilters() ``` **Methods:** ```python __call__(mols: List[Chem.Mol], n_jobs: int = 1, progress: bool = False) -> List[Dict] ``` Apply common alerts filter to a list of molecules. **Returns:** List of dictionaries with keys: - `has_alerts`: Boolean indicating if molecule has alerts - `alert_details`: List of matched alert patterns - `num_alerts`: Number of alerts found ```python check_mol(mol: Chem.Mol) -> Tuple[bool, List[str]] ``` Check a single molecule for structural alerts. **Returns:** Tuple of (has_alerts, list_of_alert_names) ### Class: NIBRFilters Novartis NIBR medicinal chemistry filters. **Constructor:** ```python NIBRFilters() ``` **Methods:** ```python __call__(mols: List[Chem.Mol], n_jobs: int = 1, progress: bool = False) -> List[bool] ``` Apply NIBR filters to molecules. **Returns:** List of booleans (True if molecule passes) ### Class: LillyDemeritsFilters Eli Lilly's demerit-based structural alert system (275 rules). **Constructor:** ```python LillyDemeritsFilters() ``` **Methods:** ```python __call__(mols: List[Chem.Mol], n_jobs: int = 1, progress: bool = False) -> List[Dict] ``` Calculate Lilly demerits for molecules. **Returns:** List of dictionaries with keys: - `demerits`: Total demerit score - `passes`: Boolean (True if demerits ≤ 100) - `matched_patterns`: List of matched patterns with scores --- ## Module: medchem.functional High-level functional API for common operations. ### nibr_filter() ```python nibr_filter(mols: List[Chem.Mol], n_jobs: int = 1) -> List[bool] ``` Apply NIBR filters using functional API. **Parameters:** - `mols`: List of molecules - `n_jobs`: Parallelization level **Returns:** List of pass/fail booleans ### common_alerts_filter() ```python common_alerts_filter(mols: List[Chem.Mol], n_jobs: int = 1) -> List[Dict] ``` Apply common alerts filter using functional API. **Returns:** List of results dictionaries ### lilly_demerits_filter() ```python lilly_demerits_filter(mols: List[Chem.Mol], n_jobs: int = 1) -> List[Dict] ``` Calculate Lilly demerits using functional API. --- ## Module: medchem.groups ### Class: ChemicalGroup Detect specific chemical groups in molecules. **Constructor:** ```python ChemicalGroup(groups: List[str], custom_smarts: Optional[Dict[str, str]] = None) ``` **Parameters:** - `groups`: List of predefined group names - `custom_smarts`: Dictionary mapping custom group names to SMARTS patterns **Predefined Groups:** - `"hinge_binders"`: Kinase hinge binding motifs - `"phosphate_binders"`: Phosphate binding groups - `"michael_acceptors"`: Michael acceptor electrophiles - `"reactive_groups"`: General reactive functionalities **Methods:** ```python has_match(mols: List[Chem.Mol]) -> List[bool] ``` Check if molecules contain any of the specified groups. ```python get_matches(mol: Chem.Mol) -> Dict[str, List[Tuple]] ``` Get detailed match information for a single molecule. **Returns:** Dictionary mapping group names to lists of atom indices ```python get_all_matches(mols: List[Chem.Mol]) -> List[Dict] ``` Get match information for all molecules. **Example:** ```python group = mc.groups.ChemicalGroup(groups=["hinge_binders", "phosphate_binders"]) matches = group.get_all_matches(mol_list) ``` --- ## Module: medchem.catalogs ### Class: NamedCatalogs Access to curated chemical catalogs. **Available Catalogs:** - `"functional_groups"`: Common functional groups - `"protecting_groups"`: Protecting group structures - `"reagents"`: Common reagents - `"fragments"`: Standard fragments **Usage:** ```python catalog = mc.catalogs.NamedCatalogs.get("functional_groups") matches = catalog.get_matches(mol) ``` --- ## Module: medchem.complexity Calculate molecular complexity metrics. ### calculate_complexity() ```python calculate_complexity(mol: Chem.Mol, method: str = "bertz") -> float ``` Calculate complexity score for a molecule. **Parameters:** - `mol`: RDKit molecule - `method`: Complexity metric ("bertz", "whitlock", "barone") **Returns:** Complexity score (higher = more complex) ### Class: ComplexityFilter Filter molecules by complexity threshold. **Constructor:** ```python ComplexityFilter(max_complexity: float, method: str = "bertz") ``` **Methods:** ```python __call__(mols: List[Chem.Mol], n_jobs: int = 1) -> List[bool] ``` Filter molecules exceeding complexity threshold. --- ## Module: medchem.constraints ### Class: Constraints Apply custom property-based constraints. **Constructor:** ```python Constraints( mw_range: Optional[Tuple[float, float]] = None, logp_range: Optional[Tuple[float, float]] = None, tpsa_max: Optional[float] = None, tpsa_range: Optional[Tuple[float, float]] = None, hbd_max: Optional[int] = None, hba_max: Optional[int] = None, rotatable_bonds_max: Optional[int] = None, rings_range: Optional[Tuple[int, int]] = None, aromatic_rings_max: Optional[int] = None, ) ``` **Parameters:** All parameters are optional. Specify only the constraints needed. **Methods:** ```python __call__(mols: List[Chem.Mol], n_jobs: int = 1) -> List[Dict] ``` Apply constraints to molecules. **Returns:** List of dictionaries with keys: - `passes`: Boolean indicating if all constraints pass - `violations`: List of constraint names that failed **Example:** ```python constraints = mc.constraints.Constraints( mw_range=(200, 500), logp_range=(-2, 5), tpsa_max=140 ) results = constraints(mols=mol_list, n_jobs=-1) ``` --- ## Module: medchem.query Query language for complex filtering. ### parse() ```python parse(query: str) -> Query ``` Parse a medchem query string into a Query object. **Query Syntax:** - Operators: `AND`, `OR`, `NOT` - Comparisons: `<`, `>`, `<=`, `>=`, `==`, `!=` - Properties: `complexity`, `lilly_demerits`, `mw`, `logp`, `tpsa` - Rules: `rule_of_five`, `rule_of_cns`, etc. - Filters: `common_alerts`, `nibr_filter`, `pains_filter` **Example Queries:** ```python "rule_of_five AND NOT common_alerts" "rule_of_cns AND complexity < 400" "mw > 200 AND mw < 500 AND logp < 5" "(rule_of_five OR rule_of_oprea) AND NOT pains_filter" ``` ### Class: Query **Methods:** ```python apply(mols: List[Chem.Mol], n_jobs: int = 1) -> List[bool] ``` Apply parsed query to molecules. **Example:** ```python query = mc.query.parse("rule_of_five AND NOT common_alerts") results = query.apply(mols=mol_list, n_jobs=-1) passing_mols = [mol for mol, passes in zip(mol_list, results) if passes] ``` --- ## Module: medchem.utils Utility functions for working with molecules. ### batch_process() ```python batch_process( mols: List[Chem.Mol], func: Callable, n_jobs: int = 1, progress: bool = False, batch_size: Optional[int] = None ) -> List ``` Process molecules in parallel batches. **Parameters:** - `mols`: List of molecules - `func`: Function to apply to each molecule - `n_jobs`: Number of parallel workers - `progress`: Show progress bar - `batch_size`: Size of processing batches ### standardize_mol() ```python standardize_mol(mol: Chem.Mol) -> Chem.Mol ``` Standardize molecule representation (sanitize, neutralize charges, etc.). --- ## Common Patterns ### Pattern: Parallel Processing All filters support parallelization: ```python # Use all CPU cores results = filter_object(mols=mol_list, n_jobs=-1, progress=True) # Use specific number of cores results = filter_object(mols=mol_list, n_jobs=4, progress=True) ``` ### Pattern: Combining Multiple Filters ```python import medchem as mc # Apply multiple filters rule_filter = mc.rules.RuleFilters(rule_list=["rule_of_five"]) alert_filter = mc.structural.CommonAlertsFilters() lilly_filter = mc.structural.LillyDemeritsFilters() # Get results rule_results = rule_filter(mols=mol_list, n_jobs=-1) alert_results = alert_filter(mols=mol_list, n_jobs=-1) lilly_results = lilly_filter(mols=mol_list, n_jobs=-1) # Combine criteria passing_mols = [ mol for i, mol in enumerate(mol_list) if rule_results[i]["passes"] and not alert_results[i]["has_alerts"] and lilly_results[i]["passes"] ] ``` ### Pattern: Working with DataFrames ```python import pandas as pd import datamol as dm import medchem as mc # Load data df = pd.read_csv("molecules.csv") df["mol"] = df["smiles"].apply(dm.to_mol) # Apply filters rfilter = mc.rules.RuleFilters(rule_list=["rule_of_five", "rule_of_cns"]) results = rfilter(mols=df["mol"].tolist(), n_jobs=-1) # Add results to dataframe df["passes_ro5"] = [r["rule_of_five"] for r in results] df["passes_cns"] = [r["rule_of_cns"] for r in results] # Filter dataframe filtered_df = df[df["passes_ro5"] & df["passes_cns"]] ```