skills/pytdc/SKILL.md
---
name: pytdc
description: "Therapeutics Data Commons. AI-ready drug discovery datasets (ADME, toxicity, DTI), benchmarks, scaffold splits, molecular oracles, for therapeutic ML and pharmacological prediction."
---

# PyTDC (Therapeutics Data Commons)

## Overview

PyTDC is an open-science platform providing AI-ready datasets and benchmarks for drug discovery and development. Access curated datasets spanning the entire therapeutics pipeline with standardized evaluation metrics and meaningful data splits, organized into three categories: single-instance prediction (molecular/protein properties), multi-instance prediction (drug-target interactions, DDI), and generation (molecule generation, retrosynthesis).
## When to Use This Skill

This skill should be used when:
- Working with drug discovery or therapeutic ML datasets
- Benchmarking machine learning models on standardized pharmaceutical tasks
- Predicting molecular properties (ADME, toxicity, bioactivity)
- Predicting drug-target or drug-drug interactions
- Generating novel molecules with desired properties
- Accessing curated datasets with proper train/test splits (scaffold, cold-split)
- Using molecular oracles for property optimization

## Installation & Setup

Install PyTDC with pip (shown here via uv):

```bash
uv pip install PyTDC
```

To upgrade to the latest version:

```bash
uv pip install PyTDC --upgrade
```

Core dependencies (automatically installed):
- numpy, pandas, tqdm, seaborn, scikit-learn, fuzzywuzzy

Additional packages are installed automatically as needed for specific features.
## Quick Start

The basic pattern for accessing any TDC dataset follows this structure:

```python
from tdc.<problem> import <Task>
data = <Task>(name='<Dataset>')
split = data.get_split(method='scaffold', seed=1, frac=[0.7, 0.1, 0.2])
df = data.get_data(format='df')
```

Where:
- `<problem>`: One of `single_pred`, `multi_pred`, or `generation`
- `<Task>`: Specific task category (e.g., ADME, DTI, MolGen)
- `<Dataset>`: Dataset name within that task

**Example - Loading ADME data:**

```python
from tdc.single_pred import ADME
data = ADME(name='Caco2_Wang')
split = data.get_split(method='scaffold')
# Returns dict with 'train', 'valid', 'test' DataFrames
```

## Single-Instance Prediction Tasks

Single-instance prediction involves forecasting properties of individual biomedical entities (molecules, proteins, etc.).

### Available Task Categories

#### 1. ADME (Absorption, Distribution, Metabolism, Excretion)

Predict pharmacokinetic properties of drug molecules.

```python
from tdc.single_pred import ADME
data = ADME(name='Caco2_Wang')  # Intestinal permeability
# Other datasets: HIA_Hou, Bioavailability_Ma, Lipophilicity_AstraZeneca, etc.
```

**Common ADME datasets:**
- Caco2 - Intestinal permeability
- HIA - Human intestinal absorption
- Bioavailability - Oral bioavailability
- Lipophilicity - Octanol-water partition coefficient
- Solubility - Aqueous solubility
- BBB - Blood-brain barrier penetration
- CYP - Cytochrome P450 metabolism

#### 2. Toxicity (Tox)

Predict toxicity and adverse effects of compounds.

```python
from tdc.single_pred import Tox
data = Tox(name='hERG')  # Cardiotoxicity
# Other datasets: AMES, DILI, Carcinogens_Lagunin, etc.
```

**Common toxicity datasets:**
- hERG - Cardiac toxicity
- AMES - Mutagenicity
- DILI - Drug-induced liver injury
- Carcinogens - Carcinogenicity
- ClinTox - Clinical trial toxicity

#### 3. HTS (High-Throughput Screening)

Bioactivity predictions from screening data.

```python
from tdc.single_pred import HTS
data = HTS(name='SARSCoV2_Vitro_Touret')
```

#### 4. QM (Quantum Mechanics)

Quantum mechanical properties of molecules.

```python
from tdc.single_pred import QM
data = QM(name='QM7')
```

#### 5. Other Single Prediction Tasks

- **Yields**: Chemical reaction yield prediction
- **Epitope**: Epitope prediction for biologics
- **Develop**: Development-stage predictions
- **CRISPROutcome**: Gene editing outcome prediction

### Data Format

Single prediction datasets typically return DataFrames with columns:
- `Drug_ID` or `Compound_ID`: Unique identifier
- `Drug` or `X`: SMILES string or molecular representation
- `Y`: Target label (continuous or binary)
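
Concretely, the rows follow this shape (IDs and values below are made up for illustration; `data.get_data(format='df')` returns the same columns as a pandas DataFrame):

```python
# Illustrative records matching the single-prediction schema (made-up values).
rows = [
    {"Drug_ID": "Aspirin", "Drug": "CC(=O)Oc1ccccc1C(=O)O", "Y": -2.1},
    {"Drug_ID": "Ibuprofen", "Drug": "CC(C)Cc1ccc(cc1)C(C)C(O)=O", "Y": -3.0},
]
smiles = [r["Drug"] for r in rows]   # model inputs
labels = [r["Y"] for r in rows]      # prediction targets
```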
## Multi-Instance Prediction Tasks

Multi-instance prediction involves forecasting properties of interactions between multiple biomedical entities.

### Available Task Categories

#### 1. DTI (Drug-Target Interaction)

Predict binding affinity between drugs and protein targets.

```python
from tdc.multi_pred import DTI
data = DTI(name='BindingDB_Kd')
split = data.get_split()
```

**Available datasets:**
- BindingDB_Kd - Dissociation constant (52,284 pairs)
- BindingDB_IC50 - Half-maximal inhibitory concentration (991,486 pairs)
- BindingDB_Ki - Inhibition constant (375,032 pairs)
- DAVIS, KIBA - Kinase binding datasets

**Data format:** Drug_ID, Target_ID, Drug (SMILES), Target (sequence), Y (binding affinity)

#### 2. DDI (Drug-Drug Interaction)

Predict interactions between drug pairs.

```python
from tdc.multi_pred import DDI
data = DDI(name='DrugBank')
split = data.get_split()
```

Multi-class classification task predicting interaction types. The dataset contains 191,808 DDI pairs covering 1,706 drugs.

#### 3. PPI (Protein-Protein Interaction)

Predict protein-protein interactions.

```python
from tdc.multi_pred import PPI
data = PPI(name='HuRI')
```

#### 4. Other Multi-Prediction Tasks

- **GDA**: Gene-disease associations
- **DrugRes**: Drug resistance prediction
- **DrugSyn**: Drug synergy prediction
- **PeptideMHC**: Peptide-MHC binding
- **AntibodyAff**: Antibody affinity prediction
- **MTI**: miRNA-target interactions
- **Catalyst**: Catalyst prediction
- **TrialOutcome**: Clinical trial outcome prediction

## Generation Tasks

Generation tasks involve creating novel biomedical entities with desired properties.

### 1. Molecular Generation (MolGen)

Generate diverse, novel molecules with desirable chemical properties.

```python
from tdc.generation import MolGen
data = MolGen(name='ChEMBL_V29')
split = data.get_split()
```

Use with oracles to optimize for specific properties:

```python
from tdc import Oracle
oracle = Oracle(name='GSK3B')
score = oracle('CC(C)Cc1ccc(cc1)C(C)C(O)=O')  # Evaluate SMILES
```

See `references/oracles.md` for all available oracle functions.

### 2. Retrosynthesis (RetroSyn)

Predict reactants needed to synthesize a target molecule.

```python
from tdc.generation import RetroSyn
data = RetroSyn(name='USPTO')
split = data.get_split()
```

The dataset contains 1,939,253 reactions from the USPTO database.

### 3. Paired Molecule Generation

Generate molecule pairs (e.g., prodrug-drug pairs).

```python
from tdc.generation import PairMolGen
data = PairMolGen(name='Prodrug')
```

For detailed oracle documentation and molecular generation workflows, refer to `references/oracles.md` and `scripts/molecular_generation.py`.

## Benchmark Groups

Benchmark groups provide curated collections of related datasets for systematic model evaluation.

### ADMET Benchmark Group
```python
from tdc.benchmark_group import admet_group

group = admet_group(path='data/')
predictions_list = []

for seed in [1, 2, 3, 4, 5]:
    benchmark = group.get('Caco2_Wang')
    name = benchmark['name']
    train_val, test = benchmark['train_val'], benchmark['test']
    train, valid = group.get_train_valid_split(benchmark=name, split_type='default', seed=seed)

    # Train model here, then predict on the held-out test set
    predictions = {name: model.predict(test)}
    predictions_list.append(predictions)

# Evaluate over the required 5 seeds (reports mean and std per dataset)
results = group.evaluate_many(predictions_list)
```
**ADMET Group includes 22 datasets** covering absorption, distribution, metabolism, excretion, and toxicity.

### Other Benchmark Groups

Available benchmark groups include collections for:
- ADMET properties
- Drug-target interactions
- Drug combination prediction
- And more specialized therapeutic tasks

For benchmark evaluation workflows, see `scripts/benchmark_evaluation.py`.

## Data Functions

TDC provides comprehensive data processing utilities organized into four categories.

### 1. Dataset Splits

Retrieve train/validation/test partitions with various strategies:

```python
# Scaffold split (default for most tasks)
split = data.get_split(method='scaffold', seed=1, frac=[0.7, 0.1, 0.2])

# Random split
split = data.get_split(method='random', seed=42, frac=[0.8, 0.1, 0.1])

# Cold split (for DTI/DDI tasks)
split = data.get_split(method='cold_drug', seed=1)    # Unseen drugs in test
split = data.get_split(method='cold_target', seed=1)  # Unseen targets in test
```

**Available split strategies:**
- `random`: Random shuffling
- `scaffold`: Scaffold-based (for chemical diversity)
- `cold_drug`, `cold_target`, `cold_drug_target`: For DTI tasks
- `temporal`: Time-based splits for temporal datasets
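
The idea behind a scaffold split can be sketched in plain Python: group molecules by a scaffold key, then assign whole groups to partitions so no scaffold appears in both train and test. The scaffold keys below are hypothetical; in practice the key is the Bemis-Murcko scaffold of each SMILES, computed with RDKit.

```python
from collections import defaultdict

# Hypothetical (molecule, scaffold_key) pairs for illustration.
mols = [("m1", "scafA"), ("m2", "scafA"), ("m3", "scafB"),
        ("m4", "scafC"), ("m5", "scafB"), ("m6", "scafC")]

# Group molecules by their scaffold.
groups = defaultdict(list)
for mol, scaf in mols:
    groups[scaf].append(mol)

# Fill train up to a budget (here 4 of 6 molecules), then spill whole
# scaffold groups into test: partitions never share a scaffold.
ordered = sorted(groups.values(), key=len, reverse=True)
train, test = [], []
for group in ordered:
    if len(train) + len(group) <= 4:
        train.extend(group)
    else:
        test.extend(group)
```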
### 2. Model Evaluation

Use standardized metrics for evaluation:

```python
from tdc import Evaluator

# For binary classification
evaluator = Evaluator(name='ROC-AUC')
score = evaluator(y_true, y_pred)

# For regression
evaluator = Evaluator(name='RMSE')
score = evaluator(y_true, y_pred)
```

**Available metrics:** ROC-AUC, PR-AUC, F1, Accuracy, RMSE, MAE, R2, Spearman, Pearson, and more.
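
For reference, the regression evaluators compute the standard formulas; written out with the stdlib only (no tdc required):

```python
import math

y_true = [0.0, 1.0, 2.0, 3.0]
y_pred = [0.5, 1.5, 2.5, 3.5]

# Mean absolute error: average of |true - pred|
mae = sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

# Root mean squared error: sqrt of the average squared residual
rmse = math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

print(mae, rmse)  # 0.5 0.5 (every residual here is exactly 0.5)
```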
### 3. Data Processing

TDC provides 11 key processing utilities:

```python
from tdc.chem_utils import MolConvert

# Molecule format conversion
converter = MolConvert(src='SMILES', dst='PyG')
pyg_graph = converter('CC(C)Cc1ccc(cc1)C(C)C(O)=O')
```

**Processing utilities include:**
- Molecule format conversion (SMILES, SELFIES, PyG, DGL, ECFP, etc.)
- Molecule filters (PAINS, drug-likeness)
- Label binarization and unit conversion
- Data balancing (over/under-sampling)
- Negative sampling for pair data
- Graph transformation
- Entity retrieval (CID to SMILES, UniProt to sequence)
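
Label binarization, for instance, is simply thresholding a continuous readout; a minimal stdlib sketch (the cutoff value is illustrative and dataset-dependent):

```python
# Binarize continuous IC50-style labels at an illustrative threshold:
# values below the cutoff count as "active" (1), the rest as "inactive" (0).
threshold = 100.0  # e.g. nM; choose per dataset
y_continuous = [12.5, 250.0, 99.9, 1500.0]
y_binary = [1 if y < threshold else 0 for y in y_continuous]
print(y_binary)  # [1, 0, 1, 0]
```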
For comprehensive utilities documentation, see `references/utilities.md`.

### 4. Molecule Generation Oracles

TDC provides 17+ oracle functions for molecular optimization:

```python
from tdc import Oracle

# Single molecule
oracle = Oracle(name='DRD2')
score = oracle('CC(C)Cc1ccc(cc1)C(C)C(O)=O')

# Batch scoring: pass a list of SMILES
oracle = Oracle(name='JNK3')
scores = oracle(['SMILES1', 'SMILES2', 'SMILES3'])
```

For complete oracle documentation, see `references/oracles.md`.

## Advanced Features

### Retrieve Available Datasets

```python
from tdc.utils import retrieve_dataset_names

# Get all ADME datasets
adme_datasets = retrieve_dataset_names('ADME')

# Get all DTI datasets
dti_datasets = retrieve_dataset_names('DTI')
```

### Label Transformations

```python
# Get label mapping
label_map = data.get_label_map(name='DrugBank')

# Convert labels
from tdc.chem_utils import label_transform
transformed = label_transform(y, from_unit='nM', to_unit='p')
```
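
The `'nM'` to `'p'` conversion is the standard negative log transform: a concentration c in nanomolar becomes p = -log10(c × 10⁻⁹). A stdlib sketch of the arithmetic:

```python
import math

def nm_to_p(c_nm):
    """Convert a concentration in nM to its negative log10 molar value (e.g. pIC50)."""
    return -math.log10(c_nm * 1e-9)

print(nm_to_p(1.0))     # 9.0  (1 nM  -> p = 9)
print(nm_to_p(1000.0))  # 6.0  (1 uM  -> p = 6)
```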
### Database Queries

```python
from tdc.utils import cid2smiles, uniprot2seq

# Convert PubChem CID to SMILES
smiles = cid2smiles(2244)

# Convert UniProt ID to amino acid sequence
sequence = uniprot2seq('P12345')
```

## Common Workflows

### Workflow 1: Train a Single Prediction Model

See `scripts/load_and_split_data.py` for a complete example:

```python
from tdc.single_pred import ADME
from tdc import Evaluator

# Load data
data = ADME(name='Caco2_Wang')
split = data.get_split(method='scaffold', seed=42)

train, valid, test = split['train'], split['valid'], split['test']

# Train model (user implements)
# model.fit(train['Drug'], train['Y'])

# Evaluate
evaluator = Evaluator(name='MAE')
# score = evaluator(test['Y'], predictions)
```

### Workflow 2: Benchmark Evaluation

See `scripts/benchmark_evaluation.py` for a complete example with multiple seeds and proper evaluation protocol.

### Workflow 3: Molecular Generation with Oracles

See `scripts/molecular_generation.py` for an example of goal-directed generation using oracle functions.

## Resources

This skill includes bundled resources for common TDC workflows:

### scripts/

- `load_and_split_data.py`: Template for loading and splitting TDC datasets with various strategies
- `benchmark_evaluation.py`: Template for running benchmark group evaluations with the proper 5-seed protocol
- `molecular_generation.py`: Template for molecular generation using oracle functions

### references/

- `datasets.md`: Comprehensive catalog of all available datasets organized by task type
- `oracles.md`: Complete documentation of all 17+ molecule generation oracles
- `utilities.md`: Detailed guide to data processing, splitting, and evaluation utilities

## Additional Resources

- **Official Website**: https://tdcommons.ai
- **Documentation**: https://tdc.readthedocs.io
- **GitHub**: https://github.com/mims-harvard/TDC
- **Paper**: NeurIPS 2021 - "Therapeutics Data Commons: Machine Learning Datasets and Tasks for Drug Discovery and Development"
skills/pytdc/references/datasets.md
# TDC Datasets Comprehensive Catalog

This document provides a comprehensive catalog of all available datasets in the Therapeutics Data Commons, organized by task category.

## Single-Instance Prediction Datasets

### ADME (Absorption, Distribution, Metabolism, Excretion)

**Absorption:**
- `Caco2_Wang` - Caco-2 cell permeability (906 compounds)
- `Caco2_AstraZeneca` - Caco-2 permeability from AstraZeneca (700 compounds)
- `HIA_Hou` - Human intestinal absorption (578 compounds)
- `Pgp_Broccatelli` - P-glycoprotein inhibition (1,212 compounds)
- `Bioavailability_Ma` - Oral bioavailability (640 compounds)
- `F20_edrug3d` - Oral bioavailability F>=20% (1,017 compounds)
- `F30_edrug3d` - Oral bioavailability F>=30% (1,017 compounds)

**Distribution:**
- `BBB_Martins` - Blood-brain barrier penetration (1,975 compounds)
- `PPBR_AZ` - Plasma protein binding rate (1,797 compounds)
- `VDss_Lombardo` - Volume of distribution at steady state (1,130 compounds)

**Metabolism:**
- `CYP2C19_Veith` - CYP2C19 inhibition (12,665 compounds)
- `CYP2D6_Veith` - CYP2D6 inhibition (13,130 compounds)
- `CYP3A4_Veith` - CYP3A4 inhibition (12,328 compounds)
- `CYP1A2_Veith` - CYP1A2 inhibition (12,579 compounds)
- `CYP2C9_Veith` - CYP2C9 inhibition (12,092 compounds)
- `CYP2C9_Substrate_CarbonMangels` - CYP2C9 substrate (666 compounds)
- `CYP2D6_Substrate_CarbonMangels` - CYP2D6 substrate (664 compounds)
- `CYP3A4_Substrate_CarbonMangels` - CYP3A4 substrate (667 compounds)

**Excretion:**
- `Half_Life_Obach` - Half-life (667 compounds)
- `Clearance_Hepatocyte_AZ` - Hepatocyte clearance (1,020 compounds)
- `Clearance_Microsome_AZ` - Microsome clearance (1,102 compounds)

**Solubility & Lipophilicity:**
- `Solubility_AqSolDB` - Aqueous solubility (9,982 compounds)
- `Lipophilicity_AstraZeneca` - Lipophilicity (logD) (4,200 compounds)
- `HydrationFreeEnergy_FreeSolv` - Hydration free energy (642 compounds)

### Toxicity

**Organ Toxicity:**
- `hERG` - hERG channel inhibition/cardiotoxicity (648 compounds)
- `hERG_Karim` - hERG blockers extended dataset (13,445 compounds)
- `DILI` - Drug-induced liver injury (475 compounds)
- `Skin_Reaction` - Skin reaction (404 compounds)
- `Carcinogens_Lagunin` - Carcinogenicity (278 compounds)
- `Respiratory_Toxicity` - Respiratory toxicity (278 compounds)

**General Toxicity:**
- `AMES` - Ames mutagenicity (7,255 compounds)
- `LD50_Zhu` - Acute toxicity LD50 (7,385 compounds)
- `ClinTox` - Clinical trial toxicity (1,478 compounds)
- `SkinSensitization` - Skin sensitization (278 compounds)
- `EyeCorrosion` - Eye corrosion (278 compounds)
- `EyeIrritation` - Eye irritation (278 compounds)

**Environmental Toxicity:**
- `Tox21-AhR` - Nuclear receptor signaling (8,169 compounds)
- `Tox21-AR` - Androgen receptor (9,362 compounds)
- `Tox21-AR-LBD` - Androgen receptor ligand binding (8,343 compounds)
- `Tox21-ARE` - Antioxidant response element (6,475 compounds)
- `Tox21-aromatase` - Aromatase inhibition (6,733 compounds)
- `Tox21-ATAD5` - DNA damage (8,163 compounds)
- `Tox21-ER` - Estrogen receptor (7,257 compounds)
- `Tox21-ER-LBD` - Estrogen receptor ligand binding (8,163 compounds)
- `Tox21-HSE` - Heat shock response (8,162 compounds)
- `Tox21-MMP` - Mitochondrial membrane potential (7,394 compounds)
- `Tox21-p53` - p53 pathway (8,163 compounds)
- `Tox21-PPAR-gamma` - PPAR gamma activation (7,396 compounds)

### HTS (High-Throughput Screening)

**SARS-CoV-2:**
- `SARSCoV2_Vitro_Touret` - In vitro antiviral activity (1,484 compounds)
- `SARSCoV2_3CLPro_Diamond` - 3CL protease inhibition (879 compounds)
- `SARSCoV2_Vitro_AlabdulKareem` - In vitro screening (5,953 compounds)

**Other Targets:**
- `Orexin1_Receptor_Butkiewicz` - Orexin receptor screening (4,675 compounds)
- `M1_Receptor_Agonist_Butkiewicz` - M1 receptor agonist (1,700 compounds)
- `M1_Receptor_Antagonist_Butkiewicz` - M1 receptor antagonist (1,700 compounds)
- `HIV_Butkiewicz` - HIV inhibition (40,000+ compounds)
- `ToxCast` - Environmental chemical screening (8,597 compounds)

### QM (Quantum Mechanics)

- `QM7` - Quantum mechanics properties (7,160 molecules)
- `QM8` - Electronic spectra and excited states (21,786 molecules)
- `QM9` - Geometric, energetic, electronic, thermodynamic properties (133,885 molecules)

### Yields

- `Buchwald-Hartwig` - Reaction yield prediction (3,955 reactions)
- `USPTO_Yields` - Yield prediction from USPTO (853,879 reactions)

### Epitope

- `IEDBpep-DiseaseBinder` - Disease-associated epitope binding (6,080 peptides)
- `IEDBpep-NonBinder` - Non-binding peptides (24,320 peptides)

### Develop (Development)

- `Manufacturing` - Manufacturing success prediction
- `Formulation` - Formulation stability

### CRISPROutcome

- `CRISPROutcome_Doench` - Gene editing efficiency prediction (5,310 guide RNAs)

## Multi-Instance Prediction Datasets

### DTI (Drug-Target Interaction)

**Binding Affinity:**
- `BindingDB_Kd` - Dissociation constant (52,284 pairs, 10,665 drugs, 1,413 proteins)
- `BindingDB_IC50` - Half-maximal inhibitory concentration (991,486 pairs, 549,205 drugs, 5,078 proteins)
- `BindingDB_Ki` - Inhibition constant (375,032 pairs, 174,662 drugs, 3,070 proteins)

**Kinase Binding:**
- `DAVIS` - Davis kinase binding dataset (30,056 pairs, 68 drugs, 442 proteins)
- `KIBA` - KIBA kinase binding dataset (118,254 pairs, 2,111 drugs, 229 proteins)

**Binary Interaction:**
- `BindingDB_Patent` - Patent-derived DTI (8,503 pairs)
- `BindingDB_Approval` - FDA-approved drug DTI (1,649 pairs)

### DDI (Drug-Drug Interaction)

- `DrugBank` - Drug-drug interactions (191,808 pairs, 1,706 drugs)
- `TWOSIDES` - Side effect-based DDI (4,649,441 pairs, 645 drugs)

### PPI (Protein-Protein Interaction)

- `HuRI` - Human reference protein interactome (52,569 interactions)
- `STRING` - Protein functional associations (19,247 interactions)

### GDA (Gene-Disease Association)

- `DisGeNET` - Gene-disease associations (81,746 pairs)
- `PrimeKG_GDA` - Gene-disease associations from the PrimeKG knowledge graph

### DrugRes (Drug Response/Resistance)

- `GDSC1` - Genomics of Drug Sensitivity in Cancer v1 (178,000 pairs)
- `GDSC2` - Genomics of Drug Sensitivity in Cancer v2 (125,000 pairs)

### DrugSyn (Drug Synergy)

- `DrugComb` - Drug combination synergy (345,502 combinations)
- `DrugCombDB` - Drug combination database (448,555 combinations)
- `OncoPolyPharmacology` - Oncology drug combinations (22,737 combinations)

### PeptideMHC

- `MHC1_NetMHCpan` - MHC class I binding (184,983 pairs)
- `MHC2_NetMHCIIpan` - MHC class II binding (134,281 pairs)

### AntibodyAff (Antibody Affinity)

- `Protein_SAbDab` - Antibody-antigen affinity (1,500+ pairs)

### MTI (miRNA-Target Interaction)

- `miRTarBase` - Experimentally validated miRNA-target interactions (380,639 pairs)

### Catalyst

- `USPTO_Catalyst` - Catalyst prediction for reactions (11,000+ reactions)

### TrialOutcome

- `TrialOutcome_WuXi` - Clinical trial outcome prediction (3,769 trials)

## Generation Datasets

### MolGen (Molecular Generation)

- `ChEMBL_V29` - Drug-like molecules from ChEMBL (1,941,410 molecules)
- `ZINC` - ZINC database subset (100,000+ molecules)
- `GuacaMol` - Goal-directed benchmark molecules
- `Moses` - Molecular sets benchmark (1,936,962 molecules)

### RetroSyn (Retrosynthesis)

- `USPTO` - Retrosynthesis from USPTO patents (1,939,253 reactions)
- `USPTO-50K` - Curated USPTO subset (50,000 reactions)

### PairMolGen (Paired Molecule Generation)

- `Prodrug` - Prodrug to drug transformations (1,000+ pairs)
- `Metabolite` - Drug to metabolite transformations

## Using retrieve_dataset_names

To programmatically access all available datasets for a specific task:

```python
from tdc.utils import retrieve_dataset_names

# Get all datasets for a specific task
adme_datasets = retrieve_dataset_names('ADME')
tox_datasets = retrieve_dataset_names('Tox')
dti_datasets = retrieve_dataset_names('DTI')
hts_datasets = retrieve_dataset_names('HTS')
```

## Dataset Statistics

Access dataset statistics directly:

```python
from tdc.single_pred import ADME
data = ADME(name='Caco2_Wang')

# Print basic statistics
data.print_stats()

# Get label distribution
data.label_distribution()
```

## Loading Datasets

All datasets follow the same loading pattern:

```python
from tdc.<problem_type> import <TaskType>
data = <TaskType>(name='<DatasetName>')

# Get full dataset
df = data.get_data(format='df')  # or 'dict', 'DeepPurpose', etc.

# Get train/valid/test split
split = data.get_split(method='scaffold', seed=1, frac=[0.7, 0.1, 0.2])
```

## Notes

- Dataset sizes and statistics are approximate and may be updated
- New datasets are regularly added to TDC
- Some datasets may require additional dependencies
- Check the official TDC website for the most up-to-date dataset list: https://tdcommons.ai/overview/
skills/pytdc/references/oracles.md
# TDC Molecule Generation Oracles

Oracles are functions that evaluate the quality of generated molecules across specific dimensions. TDC provides 17+ oracle functions for molecular optimization tasks in de novo drug design.

## Overview

Oracles measure molecular properties and serve two main purposes:

1. **Goal-Directed Generation**: Optimize molecules to maximize/minimize specific properties
2. **Distribution Learning**: Evaluate whether generated molecules match desired property distributions

## Using Oracles

### Basic Usage

```python
from tdc import Oracle

# Initialize oracle
oracle = Oracle(name='GSK3B')

# Evaluate single molecule (SMILES string)
score = oracle('CC(C)Cc1ccc(cc1)C(C)C(O)=O')

# Evaluate multiple molecules
scores = oracle(['SMILES1', 'SMILES2', 'SMILES3'])
```

### Oracle Categories

TDC oracles are organized into several categories based on the molecular property being evaluated.

## Biochemical Oracles

Predict binding affinity or activity against biological targets.

### Target-Specific Oracles

**DRD2 - Dopamine Receptor D2**
```python
oracle = Oracle(name='DRD2')
score = oracle(smiles)
```
- Measures binding affinity to the DRD2 receptor
- Important for neurological and psychiatric drug development
- Higher scores indicate stronger binding

**GSK3B - Glycogen Synthase Kinase-3 Beta**
```python
oracle = Oracle(name='GSK3B')
score = oracle(smiles)
```
- Predicts GSK3β inhibition
- Relevant for Alzheimer's, diabetes, and cancer research
- Higher scores indicate better inhibition

**JNK3 - c-Jun N-terminal Kinase 3**
```python
oracle = Oracle(name='JNK3')
score = oracle(smiles)
```
- Measures JNK3 kinase inhibition
- Target for neurodegenerative diseases
- Higher scores indicate stronger inhibition

**5HT2A - Serotonin 2A Receptor**
```python
oracle = Oracle(name='5HT2A')
score = oracle(smiles)
```
- Predicts serotonin receptor binding
- Important for psychiatric medications
- Higher scores indicate stronger binding

**ACE - Angiotensin-Converting Enzyme**
```python
oracle = Oracle(name='ACE')
score = oracle(smiles)
```
- Measures ACE inhibition
- Target for hypertension treatment
- Higher scores indicate better inhibition

**MAPK - Mitogen-Activated Protein Kinase**
```python
oracle = Oracle(name='MAPK')
score = oracle(smiles)
```
- Predicts MAPK inhibition
- Target for cancer and inflammatory diseases

**CDK - Cyclin-Dependent Kinase**
```python
oracle = Oracle(name='CDK')
score = oracle(smiles)
```
- Measures CDK inhibition
- Important for cancer drug development

**P38 - p38 MAP Kinase**
```python
oracle = Oracle(name='P38')
score = oracle(smiles)
```
- Predicts p38 MAPK inhibition
- Target for inflammatory diseases

**PARP1 - Poly (ADP-ribose) Polymerase 1**
```python
oracle = Oracle(name='PARP1')
score = oracle(smiles)
```
- Measures PARP1 inhibition
- Target for cancer treatment (DNA repair mechanism)

**PIK3CA - Phosphatidylinositol-4,5-Bisphosphate 3-Kinase**
```python
oracle = Oracle(name='PIK3CA')
score = oracle(smiles)
```
- Predicts PIK3CA inhibition
- Important target in oncology

## Physicochemical Oracles

Evaluate drug-like properties and ADME characteristics.

### Drug-Likeness Oracles

**QED - Quantitative Estimate of Drug-likeness**
```python
oracle = Oracle(name='QED')
score = oracle(smiles)
```
- Combines multiple physicochemical properties
- Score ranges from 0 (non-drug-like) to 1 (drug-like)
- Based on the Bickerton et al. criteria

**Lipinski - Rule of Five**
```python
oracle = Oracle(name='Lipinski')
score = oracle(smiles)
```
- Number of Lipinski rule violations
- Rules: MW ≤ 500, logP ≤ 5, HBD ≤ 5, HBA ≤ 10
- Score of 0 means fully compliant
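
A minimal sketch of the rule-of-five count, assuming the four descriptors have already been computed for a molecule (in practice RDKit descriptors would supply them; the ibuprofen-like values below are approximate and illustrative):

```python
def lipinski_violations(mw, logp, hbd, hba):
    """Count Lipinski rule-of-five violations from precomputed descriptors."""
    rules = [mw <= 500, logp <= 5, hbd <= 5, hba <= 10]
    return sum(not ok for ok in rules)

# Approximate ibuprofen-like descriptor values, for illustration only.
print(lipinski_violations(mw=206.3, logp=3.5, hbd=1, hba=2))  # 0
```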
|
||||
|
||||
### Molecular Properties
|
||||
|
||||
**SA - Synthetic Accessibility**
|
||||
```python
|
||||
oracle = Oracle(name='SA')
|
||||
score = oracle(smiles)
|
||||
```
|
||||
- Estimates ease of synthesis
|
||||
- Score ranges from 1 (easy) to 10 (difficult)
|
||||
- Lower scores indicate easier synthesis
|
||||
|
||||
**LogP - Octanol-Water Partition Coefficient**
|
||||
```python
|
||||
oracle = Oracle(name='LogP')
|
||||
score = oracle(smiles)
|
||||
```
|
||||
- Measures lipophilicity
|
||||
- Important for membrane permeability
|
||||
- Typical drug-like range: 0-5
|
||||
|
||||
**MW - Molecular Weight**
|
||||
```python
|
||||
oracle = Oracle(name='MW')
|
||||
score = oracle(smiles)
|
||||
```
|
||||
- Returns molecular weight in Daltons
|
||||
- Drug-like range typically 150-500 Da
|
||||
|
||||
## Composite Oracles

Combine multiple properties for multi-objective optimization.

**Isomer Meta**

```python
oracle = Oracle(name='Isomer_Meta')
score = oracle(smiles)
```

- Scores how closely a molecule matches a target molecular formula
- Used in GuacaMol-style isomer generation tasks

**Median Molecules**

```python
oracle = Oracle(name='Median1')  # 'Median2' is the second median task
score = oracle(smiles)
```

- Scores simultaneous similarity to two reference molecules
- Used in goal-directed generation benchmarks

**Rediscovery**

```python
oracle = Oracle(name='Celecoxib_Rediscovery')  # Troglitazone/Thiothixene variants also exist
score = oracle(smiles)
```

- Measures similarity to a known reference drug
- Tests the ability to regenerate existing drugs

**Similarity**

```python
oracle = Oracle(name='Aripiprazole_Similarity')  # similarity tasks are named per reference drug
score = oracle(smiles)
```

- Computes structural similarity to a target molecule
- Based on molecular fingerprints (typically Tanimoto similarity)

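As a concrete illustration of the fingerprint arithmetic behind similarity oracles, a minimal Tanimoto computation over sets of on-bits (the toy bit sets below are made up, not real Morgan fingerprints):

```python
def tanimoto(fp1: set, fp2: set) -> float:
    """Tanimoto similarity between two fingerprints given as sets of on-bits."""
    if not fp1 and not fp2:
        return 0.0
    return len(fp1 & fp2) / len(fp1 | fp2)

# Toy on-bit sets standing in for reference and query fingerprints
ref = {1, 5, 9, 12}
query = {1, 5, 7, 12, 20}
print(tanimoto(ref, query))  # 3 shared bits / 6 total bits → 0.5
```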
**Uniqueness**

```python
from tdc import Evaluator  # uniqueness is exposed as an evaluator rather than an oracle

evaluator = Evaluator(name='Uniqueness')
score = evaluator(smiles_list)
```

- Measures diversity in a generated molecule set
- Returns the fraction of unique molecules

**Novelty**

```python
from tdc import Evaluator

evaluator = Evaluator(name='Novelty')
score = evaluator(generated_smiles, training_smiles)
```

- Measures how different generated molecules are from the training set
- Higher scores indicate more novel structures

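Both metrics reduce to simple set arithmetic; a stdlib-only sketch of the usual definitions (canonical-SMILES deduplication is assumed to have happened upstream):

```python
def uniqueness(generated: list) -> float:
    """Fraction of generated molecules that are distinct."""
    return len(set(generated)) / len(generated)

def novelty(generated: list, training: list) -> float:
    """Fraction of distinct generated molecules absent from the training set."""
    unique = set(generated)
    return sum(1 for s in unique if s not in set(training)) / len(unique)

gen = ['CCO', 'CCO', 'CCN', 'c1ccccc1']
train = ['CCO', 'CCC']
print(uniqueness(gen))       # 3 distinct out of 4 → 0.75
print(novelty(gen, train))   # 'CCN' and 'c1ccccc1' are new → 2/3
```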
## Specialized Oracles

**ASKCOS - Retrosynthesis Scoring**

```python
oracle = Oracle(name='ASKCOS')
score = oracle(smiles)
```

- Evaluates synthetic feasibility using retrosynthesis
- Requires access to the ASKCOS backend (a separate IBM RXN oracle is also available)
- Scores based on retrosynthetic route availability

**Docking Score**

```python
oracle = Oracle(name='Docking')  # docking oracles are target-specific; see the TDC docs for exact names and arguments
score = oracle(smiles)
```

- Molecular docking score against a target protein
- Requires a protein structure and docking software
- Lower scores typically indicate better binding

**Vina - AutoDock Vina Score**

```python
oracle = Oracle(name='Vina')
score = oracle(smiles)
```

- Uses AutoDock Vina for protein-ligand docking
- Predicts binding affinity in kcal/mol
- More negative scores indicate stronger binding

## Multi-Objective Optimization

Combine multiple oracles for multi-property optimization:

```python
from tdc import Oracle

# Initialize multiple oracles
qed_oracle = Oracle(name='QED')
sa_oracle = Oracle(name='SA')
drd2_oracle = Oracle(name='DRD2')

# Define a custom scoring function
def multi_objective_score(smiles):
    qed = qed_oracle(smiles)
    sa = 1 / (1 + sa_oracle(smiles))  # invert SA (lower raw score is better)
    drd2 = drd2_oracle(smiles)

    # Weighted combination
    return 0.3 * qed + 0.3 * sa + 0.4 * drd2

# Evaluate a molecule (ibuprofen)
score = multi_objective_score('CC(C)Cc1ccc(cc1)C(C)C(O)=O')
```

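A weighted sum lets one strong objective mask a failing one; a geometric-mean aggregation (the style used by GuacaMol MPO tasks) forces every objective to stay above zero. A stdlib-only sketch, assuming each per-objective score is already normalized to [0, 1]:

```python
import math

def geometric_mpo(scores: list, weights: list) -> float:
    """Weighted geometric mean of per-objective scores in [0, 1]."""
    if any(s <= 0 for s in scores):
        return 0.0  # any failed objective zeroes the aggregate
    total = sum(weights)
    return math.exp(sum(w * math.log(s) for s, w in zip(scores, weights)) / total)

print(round(geometric_mpo([0.8, 0.5, 0.9], [1, 1, 1]), 3))  # → 0.711
print(geometric_mpo([0.8, 0.0, 0.9], [1, 1, 1]))            # → 0.0
```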
## Oracle Performance Considerations

### Speed
- **Fast**: QED, SA, LogP, MW, Lipinski (rule-based calculations)
- **Medium**: Target-specific ML models (DRD2, GSK3B, etc.)
- **Slow**: Docking-based oracles (Vina, ASKCOS)

### Reliability
- Many oracles are ML models trained on specific datasets
- They may not generalize to all chemical spaces
- Use multiple oracles to cross-validate results

### Batch Processing
```python
# Efficient batch evaluation
oracle = Oracle(name='GSK3B')
smiles_list = ['CCO', 'c1ccccc1', 'CC(=O)O']  # can contain thousands of molecules
scores = oracle(smiles_list)  # one batched call is faster than a per-molecule loop
```

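Optimization loops often re-score the same molecule many times; memoizing the oracle avoids recomputation. A sketch with a stand-in scoring function (`toy_oracle` is a placeholder, not a real TDC oracle):

```python
from functools import lru_cache

calls = 0

@lru_cache(maxsize=None)
def toy_oracle(smiles: str) -> float:
    """Stand-in for an expensive oracle call; results are cached by SMILES string."""
    global calls
    calls += 1
    return float(len(smiles))  # placeholder score

toy_oracle('CCO')
toy_oracle('CCO')  # served from the cache, no second evaluation
print(calls)  # → 1
```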
## Common Workflows

### Goal-Directed Generation
```python
from tdc import Oracle
from tdc.generation import MolGen

# Load training data
data = MolGen(name='ChEMBL_V29')
train_smiles = data.get_data()['Drug'].tolist()

# Initialize oracle
oracle = Oracle(name='GSK3B')

# Generate molecules (user implements the generative model)
# generated_smiles = generator.generate(n=1000)

# Evaluate generated molecules
scores = oracle(generated_smiles)
best_molecules = sorted(zip(generated_smiles, scores), key=lambda x: x[1], reverse=True)

print("Top 10 molecules:")
for smiles, score in best_molecules[:10]:
    print(f"{smiles}: {score:.3f}")
```

### Distribution Learning
```python
from tdc import Oracle
import numpy as np

# Initialize oracle
oracle = Oracle(name='QED')

# Evaluate the training set
train_scores = oracle(train_smiles)
train_mean = np.mean(train_scores)
train_std = np.std(train_scores)

# Evaluate the generated set
gen_scores = oracle(generated_smiles)
gen_mean = np.mean(gen_scores)
gen_std = np.std(gen_scores)

# Compare distributions
print(f"Training: μ={train_mean:.3f}, σ={train_std:.3f}")
print(f"Generated: μ={gen_mean:.3f}, σ={gen_std:.3f}")
```

## Integration with TDC Benchmarks

GuacaMol-style goal-directed benchmarks are exposed through the same `Oracle` interface, so generated molecules can be scored directly against benchmark objectives:

```python
from tdc import Oracle
from tdc.generation import MolGen

# Load a generation dataset for training a generative model
data = MolGen(name='MOSES')

# Score generated molecules against a benchmark oracle
oracle = Oracle(name='GSK3B')
benchmark_scores = oracle(generated_molecules)
```

## Notes

- Oracle scores are predictions, not experimental measurements
- Always validate top candidates experimentally
- Different oracles may have different score ranges and interpretations
- Some oracles require additional dependencies or API access
- Check the oracle documentation for specifics: https://tdcommons.ai/functions/oracles/

## Adding Custom Oracles

To create custom oracle functions:

```python
class CustomOracle:
    def __init__(self):
        # Initialize your model/method
        pass

    def __call__(self, smiles):
        # Implement your scoring logic
        # Return a score, or a list of scores for a list of SMILES
        pass

# Use like built-in oracles
custom_oracle = CustomOracle()
score = custom_oracle('CC(C)Cc1ccc(cc1)C(C)C(O)=O')
```

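As a concrete, stdlib-only illustration of the calling convention, here is a toy oracle that accepts either a single SMILES or a batch (the scoring rule is a deliberately silly placeholder, not a real property model):

```python
class RingCountOracle:
    """Toy oracle: scores a molecule by counting ring-closure digits in its SMILES."""

    def _score_one(self, smiles: str) -> float:
        # Each ring closure contributes two digits to the SMILES string
        return sum(ch.isdigit() for ch in smiles) / 2

    def __call__(self, smiles):
        if isinstance(smiles, list):
            return [self._score_one(s) for s in smiles]
        return self._score_one(smiles)

oracle = RingCountOracle()
print(oracle('c1ccccc1'))                 # benzene: one ring → 1.0
print(oracle(['CCO', 'c1ccc2ccccc2c1']))  # → [0.0, 2.0]
```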
## References

- TDC Oracles Documentation: https://tdcommons.ai/functions/oracles/
- GuacaMol Paper: "GuacaMol: Benchmarking Models for de Novo Molecular Design"
- MOSES Paper: "Molecular Sets (MOSES): A Benchmarking Platform for Molecular Generation Models"

skills/pytdc/references/utilities.md

# TDC Utilities and Data Functions

This document provides comprehensive documentation for TDC's data processing, evaluation, and utility functions.

## Overview

TDC provides utilities organized into four main categories:
1. **Dataset Splits** - Train/validation/test partitioning strategies
2. **Model Evaluation** - Standardized performance metrics
3. **Data Processing** - Molecule conversion, filtering, and transformation
4. **Entity Retrieval** - Database queries and conversions

## 1. Dataset Splits

Dataset splitting is crucial for evaluating model generalization. TDC provides multiple splitting strategies designed for therapeutic ML.

### Basic Split Usage

```python
from tdc.single_pred import ADME

data = ADME(name='Caco2_Wang')

# Get split with default parameters
split = data.get_split()
# Returns: {'train': DataFrame, 'valid': DataFrame, 'test': DataFrame}

# Customize split parameters
split = data.get_split(
    method='scaffold',
    seed=42,
    frac=[0.7, 0.1, 0.2]
)
```

### Split Methods

#### Random Split
Random shuffling of data - suitable for general ML tasks.

```python
split = data.get_split(method='random', seed=1)
```

**When to use:**
- Baseline model evaluation
- When chemical/temporal structure is not important
- Quick prototyping

**Not recommended for:**
- Realistic drug discovery scenarios
- Evaluating generalization to new chemical matter

#### Scaffold Split
Splits based on molecular scaffolds (Bemis-Murcko scaffolds) - ensures test molecules are structurally distinct from training.

```python
split = data.get_split(method='scaffold', seed=1)
```

**When to use:**
- Default for most single-instance prediction tasks
- Evaluating generalization to new chemical series
- Realistic drug discovery scenarios

**How it works:**
1. Extract the Bemis-Murcko scaffold from each molecule
2. Group molecules by scaffold
3. Assign whole scaffold groups to train/valid/test sets
4. Ensures test molecules have unseen scaffolds

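The grouping logic can be sketched without RDKit by treating the scaffold as an opaque key (the mapping below is made up; in practice the key would be a Bemis-Murcko scaffold SMILES):

```python
from collections import defaultdict

def scaffold_split(smiles_to_scaffold: dict, frac=(0.8, 0.1, 0.1)):
    """Assign whole scaffold groups to train/valid/test, largest groups first."""
    groups = defaultdict(list)
    for smi, scaf in smiles_to_scaffold.items():
        groups[scaf].append(smi)

    n = len(smiles_to_scaffold)
    train, valid, test = [], [], []
    # Fill train, then valid, then test, keeping each scaffold in exactly one set
    for scaf in sorted(groups, key=lambda s: -len(groups[s])):
        if len(train) + len(groups[scaf]) <= frac[0] * n:
            train.extend(groups[scaf])
        elif len(valid) + len(groups[scaf]) <= frac[1] * n:
            valid.extend(groups[scaf])
        else:
            test.extend(groups[scaf])
    return train, valid, test

mapping = {'CCO': 'A', 'CCN': 'A', 'c1ccccc1': 'B', 'Cc1ccccc1': 'B', 'CCC': 'C'}
train, valid, test = scaffold_split(mapping, frac=(0.6, 0.2, 0.2))
# No scaffold group is split across the three sets
```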
#### Cold Splits (DTI/DDI Tasks)
For multi-instance prediction, cold splits ensure the test set contains unseen drugs, targets, or both.

**Cold Drug Split:**
```python
from tdc.multi_pred import DTI

data = DTI(name='BindingDB_Kd')
split = data.get_split(method='cold_drug', seed=1)
```
- Test set contains drugs not seen during training
- Evaluates generalization to new compounds

**Cold Target Split:**
```python
split = data.get_split(method='cold_target', seed=1)
```
- Test set contains targets not seen during training
- Evaluates generalization to new proteins

**Cold Drug-Target Split:**
```python
split = data.get_split(method='cold_drug_target', seed=1)
```
- Test set contains novel drug-target pairs
- The most challenging evaluation scenario

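A cold-drug split reduces to partitioning the set of drug IDs first, then routing interaction rows by drug. A stdlib-only sketch using toy tuples with hypothetical IDs:

```python
import random

def cold_drug_split(pairs, frac_test=0.2, seed=1):
    """Split (drug, target, label) rows so test drugs never appear in train."""
    drugs = sorted({d for d, _, _ in pairs})
    rng = random.Random(seed)
    rng.shuffle(drugs)
    n_test = int(len(drugs) * frac_test)
    test_drugs = set(drugs[:n_test])
    train = [p for p in pairs if p[0] not in test_drugs]
    test = [p for p in pairs if p[0] in test_drugs]
    return train, test

pairs = [('D1', 'T1', 5.2), ('D1', 'T2', 6.0), ('D2', 'T1', 4.1),
         ('D3', 'T3', 7.5), ('D4', 'T2', 5.9), ('D5', 'T1', 6.3)]
train, test = cold_drug_split(pairs, frac_test=0.4)
# Every row is kept, and no drug appears on both sides
assert not ({d for d, _, _ in train} & {d for d, _, _ in test})
```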
#### Temporal Split
For datasets with temporal information - ensures test data comes from later time points.

```python
split = data.get_split(method='temporal', seed=1)
```

**When to use:**
- Datasets with time stamps
- Simulating prospective prediction
- Clinical trial outcome prediction

### Custom Split Fractions

```python
# 80% train, 10% valid, 10% test
split = data.get_split(method='scaffold', frac=[0.8, 0.1, 0.1])

# 70% train, 15% valid, 15% test
split = data.get_split(method='scaffold', frac=[0.7, 0.15, 0.15])
```

### Stratified Splits

For classification tasks with imbalanced labels:

```python
split = data.get_split(method='scaffold', stratified=True)
```

Maintains the label distribution across train/valid/test sets.

## 2. Model Evaluation

TDC provides standardized evaluation metrics for different task types.

### Basic Evaluator Usage

```python
from tdc import Evaluator

# Initialize evaluator
evaluator = Evaluator(name='ROC-AUC')

# Evaluate predictions
score = evaluator(y_true, y_pred)
```

### Classification Metrics

#### ROC-AUC
Receiver Operating Characteristic - Area Under Curve

```python
evaluator = Evaluator(name='ROC-AUC')
score = evaluator(y_true, y_pred_proba)
```

**Best for:**
- Binary classification
- Imbalanced datasets
- Overall discriminative ability

**Range:** 0-1 (higher is better; 0.5 is random)

#### PR-AUC
Precision-Recall Area Under Curve

```python
evaluator = Evaluator(name='PR-AUC')
score = evaluator(y_true, y_pred_proba)
```

**Best for:**
- Highly imbalanced datasets
- When the positive class is rare
- Complements ROC-AUC

**Range:** 0-1 (higher is better)

#### F1 Score
Harmonic mean of precision and recall

```python
evaluator = Evaluator(name='F1')
score = evaluator(y_true, y_pred_binary)
```

**Best for:**
- Balancing precision and recall
- Multi-class classification

**Range:** 0-1 (higher is better)

#### Accuracy
Fraction of correct predictions

```python
evaluator = Evaluator(name='Accuracy')
score = evaluator(y_true, y_pred_binary)
```

**Best for:**
- Balanced datasets
- Simple baseline metric

**Not recommended for:** imbalanced datasets

#### Cohen's Kappa
Agreement between predictions and ground truth, corrected for chance

```python
evaluator = Evaluator(name='Kappa')
score = evaluator(y_true, y_pred_binary)
```

**Range:** -1 to 1 (higher is better; 0 is chance-level)

### Regression Metrics

#### RMSE - Root Mean Squared Error
```python
evaluator = Evaluator(name='RMSE')
score = evaluator(y_true, y_pred)
```

**Best for:**
- Continuous predictions
- Penalizing large errors heavily

**Range:** 0-∞ (lower is better)

#### MAE - Mean Absolute Error
```python
evaluator = Evaluator(name='MAE')
score = evaluator(y_true, y_pred)
```

**Best for:**
- Continuous predictions
- More robust to outliers than RMSE

**Range:** 0-∞ (lower is better)

#### R² - Coefficient of Determination
```python
evaluator = Evaluator(name='R2')
score = evaluator(y_true, y_pred)
```

**Best for:**
- Variance explained by the model
- Comparing different models

**Range:** -∞ to 1 (higher is better; 1 is perfect)

#### MSE - Mean Squared Error
```python
evaluator = Evaluator(name='MSE')
score = evaluator(y_true, y_pred)
```

**Range:** 0-∞ (lower is better)

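For reference, the regression metrics above reduce to a few lines of arithmetic; a stdlib-only sketch (TDC's evaluators wrap standard implementations of the same formulas):

```python
import math

def mae(y_true, y_pred):
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def mse(y_true, y_pred):
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def rmse(y_true, y_pred):
    return math.sqrt(mse(y_true, y_pred))

def r2(y_true, y_pred):
    mean_t = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_t) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot

y_true, y_pred = [1.0, 2.0, 3.0], [1.5, 2.0, 2.5]
print(mae(y_true, y_pred))  # errors 0.5, 0.0, 0.5 → 1/3
```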
### Ranking Metrics

#### Spearman Correlation
Rank correlation coefficient

```python
evaluator = Evaluator(name='Spearman')
score = evaluator(y_true, y_pred)
```

**Best for:**
- Ranking tasks
- Non-linear monotonic relationships
- Ordinal data

**Range:** -1 to 1 (higher is better)

#### Pearson Correlation
Linear correlation coefficient

```python
evaluator = Evaluator(name='Pearson')
score = evaluator(y_true, y_pred)
```

**Best for:**
- Linear relationships
- Continuous data

**Range:** -1 to 1 (higher is better)

### Multi-Label Classification

```python
evaluator = Evaluator(name='Micro-F1')
score = evaluator(y_true_multilabel, y_pred_multilabel)
```

Available: `Micro-F1`, `Macro-F1`, `Micro-AUPR`, `Macro-AUPR`

### Benchmark Group Evaluation

Benchmark groups require the five-seed protocol: train once per seed, collect one prediction dictionary per seed, and evaluate them together.

```python
from tdc.benchmark_group import admet_group

group = admet_group(path='data/')
predictions_list = []

for seed in [1, 2, 3, 4, 5]:
    benchmark = group.get('Caco2_Wang')
    name = benchmark['name']
    train_val, test = benchmark['train_val'], benchmark['test']
    train, valid = group.get_train_valid_split(benchmark=name, split_type='default', seed=seed)

    # Train the model and predict on the test set
    # y_pred_test = model.predict(test['Drug'])
    predictions_list.append({name: y_pred_test})

# Mean and std across seeds
results = group.evaluate_many(predictions_list)
print(results)  # {'caco2_wang': [mean_score, std_score]}
```

## 3. Data Processing

TDC provides 11 data processing utilities.

### Molecule Format Conversion

Convert between roughly 15 molecular representations.

```python
from tdc.chem_utils import MolConvert

# SMILES to PyTorch Geometric
converter = MolConvert(src='SMILES', dst='PyG')
pyg_graph = converter('CC(C)Cc1ccc(cc1)C(C)C(O)=O')

# SMILES to DGL
converter = MolConvert(src='SMILES', dst='DGL')
dgl_graph = converter('CC(C)Cc1ccc(cc1)C(C)C(O)=O')

# SMILES to Morgan fingerprint (ECFP)
converter = MolConvert(src='SMILES', dst='ECFP')
fingerprint = converter('CC(C)Cc1ccc(cc1)C(C)C(O)=O')
```

**Available formats:**
- **Text**: SMILES, SELFIES, InChI
- **Fingerprints**: ECFP (Morgan), MACCS, RDKit, AtomPair, TopologicalTorsion
- **Graphs**: PyG (PyTorch Geometric), DGL (Deep Graph Library)
- **3D**: Graph3D, Coulomb Matrix, Distance Matrix

**Batch conversion:**
```python
converter = MolConvert(src='SMILES', dst='PyG')
graphs = converter(['CCO', 'c1ccccc1', 'CC(=O)O'])
```

### Molecule Filters

Remove non-drug-like molecules using curated chemical rules.

```python
from tdc.chem_utils import MolFilter

# Initialize filter with rules
mol_filter = MolFilter(
    rules=['PAINS', 'BMS'],      # chemical filter rules
    property_filters_dict={
        'MW': (150, 500),        # molecular weight range
        'LogP': (-0.4, 5.6),     # lipophilicity range
        'HBD': (0, 5),           # H-bond donors
        'HBA': (0, 10)           # H-bond acceptors
    }
)

# Filter molecules
filtered_smiles = mol_filter(smiles_list)
```

**Available filter rules:**
- `PAINS` - Pan-Assay Interference Compounds
- `BMS` - Bristol-Myers Squibb HTS deck filters
- `Glaxo` - GlaxoSmithKline filters
- `Dundee` - University of Dundee filters
- `Inpharmatica` - Inpharmatica filters
- `LINT` - Pfizer LINT filters

### Label Distribution Visualization

```python
# Visualize label distribution
data.label_distribution()

# Print statistics
data.print_stats()
```

Displays a histogram and computes mean, median, and std for continuous labels.

### Label Binarization

Convert continuous labels to binary using a threshold.

```python
# Binarize labels in place on a data loader
data.binarize(threshold=5.0, order='ascending')
# order='ascending': values >= threshold become 1
# order='descending': values <= threshold become 1
```

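The transformation itself is one comparison per label; a stdlib-only sketch of the two conventions:

```python
def binarize(values, threshold, order='ascending'):
    """Map continuous labels to {0, 1} around a threshold."""
    if order == 'ascending':
        return [1 if v >= threshold else 0 for v in values]
    return [1 if v <= threshold else 0 for v in values]

print(binarize([3.2, 5.0, 7.1], threshold=5.0))  # → [0, 1, 1]
```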
### Label Units Conversion

Transform between measurement units.

```python
from tdc.chem_utils import label_transform

# Convert nM to pKd
y_pkd = label_transform(y_nM, from_unit='nM', to_unit='p')

# Convert μM to nM
y_nM = label_transform(y_uM, from_unit='uM', to_unit='nM')
```

**Available conversions:**
- Binding affinity: nM, μM, pKd, pKi, pIC50
- Log transformations
- Natural log conversions

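The nM → pKd conversion is a log transform: pKd = −log10(Kd in molar), and 1 nM = 10⁻⁹ M. A stdlib-only check of the arithmetic:

```python
import math

def nM_to_pKd(kd_nM: float) -> float:
    """pKd = -log10(Kd in molar); a 1 nM binder has pKd 9."""
    return -math.log10(kd_nM * 1e-9)

print(round(nM_to_pKd(1.0), 6))    # → 9.0
print(round(nM_to_pKd(100.0), 6))  # → 7.0
```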
### Label Meaning

Get interpretable descriptions for labels.

```python
# Get label mapping
label_map = data.get_label_map(name='DrugBank')
print(label_map)
# {0: 'No interaction', 1: 'Increased effect', 2: 'Decreased effect', ...}
```

### Data Balancing

Handle class imbalance via over- or under-sampling.

```python
from tdc.utils import balance

# Oversample the minority class
X_balanced, y_balanced = balance(X, y, method='oversample')

# Undersample the majority class
X_balanced, y_balanced = balance(X, y, method='undersample')
```

### Graph Transformation for Pair Data

Convert paired data to graph representations.

```python
from tdc.utils import create_graph_from_pairs

# Create a graph from drug-drug pairs
graph = create_graph_from_pairs(
    pairs=ddi_pairs,      # [(drug1, drug2, label), ...]
    format='edge_list'    # or 'PyG', 'DGL'
)
```

### Negative Sampling

Generate negative samples for binary tasks.

```python
from tdc.utils import negative_sample

# Generate negative samples for DTI
negative_pairs = negative_sample(
    positive_pairs=known_interactions,
    all_drugs=drug_list,
    all_targets=target_list,
    ratio=1.0    # negative:positive ratio
)
```

**Use cases:**
- Drug-target interaction prediction
- Drug-drug interaction tasks
- Creating balanced datasets

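The core idea is drawing random drug-target pairs and rejecting any that are known positives; a stdlib-only sketch with toy, hypothetical IDs:

```python
import random

def sample_negatives(positives, drugs, targets, ratio=1.0, seed=0):
    """Draw random (drug, target) pairs not present in the positive set."""
    rng = random.Random(seed)
    known = set(positives)
    n_needed = int(len(positives) * ratio)
    negatives = set()
    while len(negatives) < n_needed:
        pair = (rng.choice(drugs), rng.choice(targets))
        if pair not in known:
            negatives.add(pair)
    return list(negatives)

positives = [('D1', 'T1'), ('D2', 'T2')]
negs = sample_negatives(positives, ['D1', 'D2', 'D3'], ['T1', 'T2', 'T3'])
# As many negatives as positives, none overlapping a known interaction
assert len(negs) == 2 and not set(negs) & set(positives)
```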
### Entity Retrieval

Convert between database identifiers.

#### PubChem CID to SMILES
```python
from tdc.utils import cid2smiles

smiles = cid2smiles(2244)  # aspirin
# Returns: 'CC(=O)Oc1ccccc1C(=O)O'
```

#### UniProt ID to Amino Acid Sequence
```python
from tdc.utils import uniprot2seq

sequence = uniprot2seq('P12345')
# Returns: 'MVKVYAPASS...'
```

#### Batch Retrieval
```python
# Multiple CIDs
smiles_list = [cid2smiles(cid) for cid in [2244, 5090, 6323]]

# Multiple UniProt IDs
sequences = [uniprot2seq(uid) for uid in ['P12345', 'Q9Y5S9']]
```

## 4. Advanced Utilities

### Retrieve Dataset Names

```python
from tdc.utils import retrieve_dataset_names

# Get all datasets for a task
adme_datasets = retrieve_dataset_names('ADME')
dti_datasets = retrieve_dataset_names('DTI')
tox_datasets = retrieve_dataset_names('Tox')

print(f"ADME datasets: {adme_datasets}")
```

### Fuzzy Search

TDC supports fuzzy matching for dataset names:

```python
from tdc.single_pred import ADME

# These all work (typo-tolerant)
data = ADME(name='Caco2_Wang')
data = ADME(name='caco2_wang')
data = ADME(name='Caco2')  # partial match
```

### Data Format Options

```python
# Pandas DataFrame (default)
df = data.get_data(format='df')

# Dictionary
data_dict = data.get_data(format='dict')

# DeepPurpose format (for the DeepPurpose library)
dp_format = data.get_data(format='DeepPurpose')

# PyG/DGL graphs (if applicable)
graphs = data.get_data(format='PyG')
```

### Data Loader Utilities

```python
from tdc.utils import create_fold

# Create cross-validation folds
folds = create_fold(data, fold=5, seed=42)
# Returns a list of (train_idx, test_idx) tuples

# Iterate through folds
for i, (train_idx, test_idx) in enumerate(folds):
    train_data = data.iloc[train_idx]
    test_data = data.iloc[test_idx]
    # Train and evaluate
```

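K-fold indexing itself is a short computation; a stdlib-only sketch producing disjoint, near-equal folds (TDC's fold utilities additionally handle shuffling and seeding):

```python
def kfold_indices(n: int, k: int):
    """Yield (train_idx, test_idx) pairs for k contiguous folds over n samples."""
    fold_sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test_idx = list(range(start, start + size))
        train_idx = [i for i in range(n) if i < start or i >= start + size]
        yield train_idx, test_idx
        start += size

folds = list(kfold_indices(10, 5))
# Five folds of two test samples each, covering all ten indices exactly once
```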
## Common Workflows

### Workflow 1: Complete Data Pipeline

```python
from tdc.single_pred import ADME
from tdc import Evaluator
from tdc.chem_utils import MolConvert, MolFilter

# 1. Load data
data = ADME(name='Caco2_Wang')

# 2. Filter molecules (keep rows whose drug passes the PAINS filter)
mol_filter = MolFilter(rules=['PAINS'])
filtered_data = data.get_data()
mask = filtered_data['Drug'].apply(lambda x: len(mol_filter([x])) > 0)
filtered_data = filtered_data[mask]

# 3. Split data
split = data.get_split(method='scaffold', seed=42)
train, valid, test = split['train'], split['valid'], split['test']

# 4. Convert to graph representations
converter = MolConvert(src='SMILES', dst='PyG')
train_graphs = converter(train['Drug'].tolist())

# 5. Train model (user implements)
# model.fit(train_graphs, train['Y'])

# 6. Evaluate
evaluator = Evaluator(name='MAE')
# score = evaluator(test['Y'], predictions)
```

### Workflow 2: Multi-Task Learning Preparation

```python
from tdc.benchmark_group import admet_group
from tdc.chem_utils import MolConvert

# Load benchmark group
group = admet_group(path='data/')

# Get multiple datasets
datasets = ['Caco2_Wang', 'HIA_Hou', 'Bioavailability_Ma']
all_data = {}

for dataset_name in datasets:
    benchmark = group.get(dataset_name)
    all_data[dataset_name] = benchmark

# Prepare for multi-task learning
converter = MolConvert(src='SMILES', dst='ECFP')
# Process each dataset...
```

### Workflow 3: DTI Cold Split Evaluation

```python
from tdc.multi_pred import DTI
from tdc import Evaluator

# Load DTI data
data = DTI(name='BindingDB_Kd')

# Cold drug split
split = data.get_split(method='cold_drug', seed=42)
train, test = split['train'], split['test']

# Verify no drug overlap
train_drugs = set(train['Drug_ID'])
test_drugs = set(test['Drug_ID'])
assert len(train_drugs & test_drugs) == 0, "Drug leakage detected!"

# Train and evaluate
# model.fit(train)
evaluator = Evaluator(name='RMSE')
# score = evaluator(test['Y'], predictions)
```

## Best Practices

1. **Always use meaningful splits** - Use scaffold or cold splits for realistic evaluation
2. **Multiple seeds** - Run experiments with multiple seeds for robust results
3. **Appropriate metrics** - Choose metrics that match your task and dataset characteristics
4. **Data filtering** - Remove PAINS and non-drug-like molecules before training
5. **Format conversion** - Convert molecules to the appropriate format for your model
6. **Batch processing** - Use batch operations for efficiency with large datasets

## Performance Tips

- Convert molecules in batch mode for faster processing
- Cache converted representations to avoid recomputation
- Use the appropriate data format for your framework (PyG, DGL, etc.)
- Filter data early in the pipeline to reduce computation

## References

- TDC Documentation: https://tdc.readthedocs.io
- Data Functions: https://tdcommons.ai/fct_overview/
- Evaluation Metrics: https://tdcommons.ai/functions/model_eval/
- Data Splits: https://tdcommons.ai/functions/data_split/

skills/pytdc/scripts/benchmark_evaluation.py

#!/usr/bin/env python3
"""
TDC Benchmark Group Evaluation Template

This script demonstrates how to use TDC benchmark groups for systematic
model evaluation following the required 5-seed protocol.

Usage:
    python benchmark_evaluation.py
"""

from tdc.benchmark_group import admet_group
from tdc import Evaluator
import numpy as np
import pandas as pd


def load_benchmark_group():
    """Load the ADMET benchmark group."""
    print("=" * 60)
    print("Loading ADMET Benchmark Group")
    print("=" * 60)

    # Initialize benchmark group
    group = admet_group(path='data/')

    # Get available benchmarks
    print("\nAvailable benchmarks in ADMET group:")
    benchmark_names = group.dataset_names
    print(f"Total: {len(benchmark_names)} datasets")

    for i, name in enumerate(benchmark_names[:10], 1):
        print(f"  {i}. {name}")

    if len(benchmark_names) > 10:
        print(f"  ... and {len(benchmark_names) - 10} more")

    return group


def single_dataset_evaluation(group, dataset_name='Caco2_Wang'):
    """Evaluate on a single dataset with the 5-seed protocol."""
    print("\n" + "=" * 60)
    print(f"Example 1: Single Dataset Evaluation ({dataset_name})")
    print("=" * 60)

    predictions_list = []
    evaluator = Evaluator(name='MAE')

    # Required: evaluate with 5 different seeds
    for seed in [1, 2, 3, 4, 5]:
        print(f"\n--- Seed {seed} ---")

        # Get the benchmark and a seed-specific train/valid split
        benchmark = group.get(dataset_name)
        name = benchmark['name']
        train_val, test = benchmark['train_val'], benchmark['test']
        train, valid = group.get_train_valid_split(
            benchmark=name, split_type='default', seed=seed
        )

        print(f"Train size: {len(train)}")
        print(f"Valid size: {len(valid)}")

        # TODO: Replace with your model training
        # model = YourModel()
        # model.fit(train['Drug'], train['Y'])
        # y_pred = model.predict(test['Drug'])

        # For demonstration, simulate predictions with controlled noise
        y_true = test['Y'].values
        np.random.seed(seed)
        y_pred = y_true + np.random.normal(0, 0.3, len(y_true))

        predictions_list.append({name: y_pred})

        # Evaluate this seed
        score = evaluator(y_true, y_pred)
        print(f"MAE for seed {seed}: {score:.4f}")

    # Evaluate across all seeds (mean and std)
    print("\n--- Overall Evaluation ---")
    results = group.evaluate_many(predictions_list)

    key = list(results.keys())[0]
    mean_score, std_score = results[key]
    print(f"\nResults for {dataset_name}:")
    print(f"  Mean MAE: {mean_score:.4f}")
    print(f"  Std MAE:  {std_score:.4f}")

    return predictions_list, results


def multiple_datasets_evaluation(group):
    """Evaluate on multiple datasets."""
    print("\n" + "=" * 60)
    print("Example 2: Multiple Datasets Evaluation")
    print("=" * 60)

    # Select a subset of datasets for demonstration
    selected_datasets = ['Caco2_Wang', 'HIA_Hou', 'Bioavailability_Ma']

    all_predictions = {}
    all_results = {}

    for dataset_name in selected_datasets:
        print(f"\n{'=' * 40}")
        print(f"Evaluating: {dataset_name}")
        print(f"{'=' * 40}")

        predictions_list = []

        # Train and predict for each seed
        for seed in [1, 2, 3, 4, 5]:
            benchmark = group.get(dataset_name)
            name = benchmark['name']
            test = benchmark['test']
            train, valid = group.get_train_valid_split(
                benchmark=name, split_type='default', seed=seed
            )

            # TODO: Replace with your model
            # model = YourModel()
            # model.fit(train['Drug'], train['Y'])
            # y_pred = model.predict(test['Drug'])

            # Dummy predictions for demonstration
            np.random.seed(seed)
            y_true = test['Y'].values
            y_pred = y_true + np.random.normal(0, 0.3, len(y_true))
            predictions_list.append({name: y_pred})

        all_predictions[dataset_name] = predictions_list

        # Evaluate this dataset across seeds
        results = group.evaluate_many(predictions_list)
        key = list(results.keys())[0]
        all_results[dataset_name] = results[key]

        mean_score, std_score = results[key]
        print(f"  {dataset_name}: {mean_score:.4f} ± {std_score:.4f}")

    # Summary
    print("\n" + "=" * 60)
    print("Summary of Results")
    print("=" * 60)

    results_df = pd.DataFrame([
        {
            'Dataset': name,
            'Mean MAE': f"{mean:.4f}",
            'Std MAE': f"{std:.4f}"
        }
        for name, (mean, std) in all_results.items()
    ])

    print(results_df.to_string(index=False))

    return all_predictions, all_results


def custom_model_template():
|
||||
"""
|
||||
Template for integrating your own model with TDC benchmarks
|
||||
"""
|
||||
print("\n" + "=" * 60)
|
||||
print("Example 3: Custom Model Template")
|
||||
print("=" * 60)
|
||||
|
||||
code_template = '''
|
||||
# Template for using your own model with TDC benchmarks
|
||||
|
||||
from tdc.benchmark_group import admet_group
|
||||
from your_library import YourModel # Replace with your model
|
||||
|
||||
# Initialize benchmark group
|
||||
group = admet_group(path='data/')
|
||||
benchmark = group.get('Caco2_Wang')
|
||||
|
||||
predictions = {}
|
||||
|
||||
for seed in [1, 2, 3, 4, 5]:
|
||||
# Get data for this seed
|
||||
train = benchmark[seed]['train']
|
||||
valid = benchmark[seed]['valid']
|
||||
test = benchmark[seed]['test']
|
||||
|
||||
# Extract features and labels
|
||||
X_train, y_train = train['Drug'], train['Y']
|
||||
X_valid, y_valid = valid['Drug'], valid['Y']
|
||||
X_test = test['Drug']
|
||||
|
||||
# Initialize and train model
|
||||
model = YourModel(random_state=seed)
|
||||
model.fit(X_train, y_train)
|
||||
|
||||
# Optionally use validation set for early stopping
|
||||
# model.fit(X_train, y_train, validation_data=(X_valid, y_valid))
|
||||
|
||||
# Make predictions on test set
|
||||
predictions[seed] = model.predict(X_test)
|
||||
|
||||
# Evaluate with TDC
|
||||
results = group.evaluate(predictions)
|
||||
print(f"Results: {results}")
|
||||
'''
|
||||
|
||||
print("\nCustom Model Integration Template:")
|
||||
print("=" * 60)
|
||||
print(code_template)
|
||||
|
||||
return code_template
|
||||
|
||||
|
||||
def multi_seed_statistics(predictions_dict):
|
||||
"""
|
||||
Example: Analyzing multi-seed prediction statistics
|
||||
"""
|
||||
print("\n" + "=" * 60)
|
||||
print("Example 4: Multi-Seed Statistics Analysis")
|
||||
print("=" * 60)
|
||||
|
||||
# Analyze prediction variability across seeds
|
||||
all_preds = np.array([predictions_dict[seed] for seed in [1, 2, 3, 4, 5]])
|
||||
|
||||
print("\nPrediction statistics across 5 seeds:")
|
||||
print(f" Shape: {all_preds.shape}")
|
||||
print(f" Mean prediction: {all_preds.mean():.4f}")
|
||||
print(f" Std across seeds: {all_preds.std(axis=0).mean():.4f}")
|
||||
print(f" Min prediction: {all_preds.min():.4f}")
|
||||
print(f" Max prediction: {all_preds.max():.4f}")
|
||||
|
||||
# Per-sample variance
|
||||
per_sample_std = all_preds.std(axis=0)
|
||||
print(f"\nPer-sample prediction std:")
|
||||
print(f" Mean: {per_sample_std.mean():.4f}")
|
||||
print(f" Median: {np.median(per_sample_std):.4f}")
|
||||
print(f" Max: {per_sample_std.max():.4f}")
|
||||
|
||||
|
||||
def leaderboard_submission_guide():
|
||||
"""
|
||||
Guide for submitting to TDC leaderboards
|
||||
"""
|
||||
print("\n" + "=" * 60)
|
||||
print("Example 5: Leaderboard Submission Guide")
|
||||
print("=" * 60)
|
||||
|
||||
guide = """
|
||||
To submit results to TDC leaderboards:
|
||||
|
||||
1. Evaluate your model following the 5-seed protocol:
|
||||
- Use seeds [1, 2, 3, 4, 5] exactly as provided
|
||||
- Do not modify the train/valid/test splits
|
||||
- Report mean ± std across all 5 seeds
|
||||
|
||||
2. Format your results:
|
||||
results = group.evaluate(predictions)
|
||||
# Returns: {'dataset_name': [mean_score, std_score]}
|
||||
|
||||
3. Submit to leaderboard:
|
||||
- Visit: https://tdcommons.ai/benchmark/admet_group/
|
||||
- Click on your dataset of interest
|
||||
- Submit your results with:
|
||||
* Model name and description
|
||||
* Mean score ± standard deviation
|
||||
* Reference to paper/code (if available)
|
||||
|
||||
4. Best practices:
|
||||
- Report all datasets in the benchmark group
|
||||
- Include model hyperparameters
|
||||
- Share code for reproducibility
|
||||
- Compare against baseline models
|
||||
|
||||
5. Evaluation metrics:
|
||||
- ADMET Group uses MAE by default
|
||||
- Other groups may use different metrics
|
||||
- Check benchmark-specific requirements
|
||||
"""
|
||||
|
||||
print(guide)
|
||||
|
||||
|
||||
def main():
|
||||
"""
|
||||
Main function to run all benchmark evaluation examples
|
||||
"""
|
||||
print("\n" + "=" * 60)
|
||||
print("TDC Benchmark Group Evaluation Examples")
|
||||
print("=" * 60)
|
||||
|
||||
# Load benchmark group
|
||||
group = load_benchmark_group()
|
||||
|
||||
# Example 1: Single dataset evaluation
|
||||
predictions, results = single_dataset_evaluation(group)
|
||||
|
||||
# Example 2: Multiple datasets evaluation
|
||||
all_predictions, all_results = multiple_datasets_evaluation(group)
|
||||
|
||||
# Example 3: Custom model template
|
||||
custom_model_template()
|
||||
|
||||
# Example 4: Multi-seed statistics
|
||||
multi_seed_statistics(predictions)
|
||||
|
||||
# Example 5: Leaderboard submission guide
|
||||
leaderboard_submission_guide()
|
||||
|
||||
print("\n" + "=" * 60)
|
||||
print("Benchmark evaluation examples completed!")
|
||||
print("=" * 60)
|
||||
print("\nNext steps:")
|
||||
print("1. Replace dummy predictions with your model")
|
||||
print("2. Run full evaluation on all benchmark datasets")
|
||||
print("3. Submit results to TDC leaderboard")
|
||||
print("=" * 60)
|
||||
|
||||
|
||||
if __name__ == "__main__":
|
||||
main()
|
||||
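The 5-seed protocol in the script above ends with a single mean ± std pair per dataset. As a self-contained illustration of that aggregation step (independent of TDC; the per-seed MAE values below are made up for demonstration):

```python
import statistics

def aggregate_seed_scores(per_seed_mae):
    """Collapse per-seed MAE values into the (mean, std) pair
    that TDC-style leaderboards report."""
    values = list(per_seed_mae.values())
    mean = statistics.mean(values)
    std = statistics.stdev(values)  # sample std across the 5 seeds
    return mean, std

# Hypothetical per-seed MAE values for seeds 1-5
per_seed_mae = {1: 0.30, 2: 0.32, 3: 0.29, 4: 0.31, 5: 0.33}
mean, std = aggregate_seed_scores(per_seed_mae)
print(f"MAE: {mean:.4f} +/- {std:.4f}")  # → MAE: 0.3100 +/- 0.0158
```

When plugging in a real model, the only change is that each value in `per_seed_mae` comes from evaluating that seed's test predictions.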
214
skills/pytdc/scripts/load_and_split_data.py
Normal file
@@ -0,0 +1,214 @@
#!/usr/bin/env python3
"""
TDC Data Loading and Splitting Template

This script demonstrates how to load TDC datasets and apply different
splitting strategies for model training and evaluation.

Usage:
    python load_and_split_data.py
"""

from tdc.single_pred import ADME
from tdc.multi_pred import DTI
from tdc import Evaluator
import numpy as np


def load_single_pred_example():
    """
    Example: Loading and splitting a single-prediction dataset (ADME)
    """
    print("=" * 60)
    print("Example 1: Single-Prediction Task (ADME)")
    print("=" * 60)

    # Load Caco2 dataset (intestinal permeability)
    print("\nLoading Caco2_Wang dataset...")
    data = ADME(name='Caco2_Wang')

    # Get basic dataset info
    print(f"\nDataset size: {len(data.get_data())} molecules")
    data.print_stats()

    # Method 1: Scaffold split (default, recommended)
    print("\n--- Scaffold Split ---")
    split = data.get_split(method='scaffold', seed=42, frac=[0.7, 0.1, 0.2])

    train = split['train']
    valid = split['valid']
    test = split['test']

    print(f"Train: {len(train)} molecules")
    print(f"Valid: {len(valid)} molecules")
    print(f"Test: {len(test)} molecules")

    # Display sample data
    print("\nSample training data:")
    print(train.head(3))

    # Method 2: Random split
    print("\n--- Random Split ---")
    split_random = data.get_split(method='random', seed=42, frac=[0.8, 0.1, 0.1])
    print(f"Train: {len(split_random['train'])} molecules")
    print(f"Valid: {len(split_random['valid'])} molecules")
    print(f"Test: {len(split_random['test'])} molecules")

    return split


def load_multi_pred_example():
    """
    Example: Loading and splitting a multi-prediction dataset (DTI)
    """
    print("\n" + "=" * 60)
    print("Example 2: Multi-Prediction Task (DTI)")
    print("=" * 60)

    # Load BindingDB Kd dataset (drug-target interactions)
    print("\nLoading BindingDB_Kd dataset...")
    data = DTI(name='BindingDB_Kd')

    # Get basic dataset info
    full_data = data.get_data()
    print(f"\nDataset size: {len(full_data)} drug-target pairs")
    print(f"Unique drugs: {full_data['Drug_ID'].nunique()}")
    print(f"Unique targets: {full_data['Target_ID'].nunique()}")

    # Method 1: Random split
    print("\n--- Random Split ---")
    split_random = data.get_split(method='random', seed=42)
    print(f"Train: {len(split_random['train'])} pairs")
    print(f"Valid: {len(split_random['valid'])} pairs")
    print(f"Test: {len(split_random['test'])} pairs")

    # Method 2: Cold drug split (unseen drugs in test)
    print("\n--- Cold Drug Split ---")
    split_cold_drug = data.get_split(method='cold_drug', seed=42)

    train = split_cold_drug['train']
    test = split_cold_drug['test']

    # Verify no drug overlap
    train_drugs = set(train['Drug_ID'])
    test_drugs = set(test['Drug_ID'])
    overlap = train_drugs & test_drugs

    print(f"Train: {len(train)} pairs, {len(train_drugs)} unique drugs")
    print(f"Test: {len(test)} pairs, {len(test_drugs)} unique drugs")
    print(f"Drug overlap: {len(overlap)} (should be 0)")

    # Method 3: Cold target split (unseen targets in test)
    print("\n--- Cold Target Split ---")
    split_cold_target = data.get_split(method='cold_target', seed=42)

    train = split_cold_target['train']
    test = split_cold_target['test']

    train_targets = set(train['Target_ID'])
    test_targets = set(test['Target_ID'])
    overlap = train_targets & test_targets

    print(f"Train: {len(train)} pairs, {len(train_targets)} unique targets")
    print(f"Test: {len(test)} pairs, {len(test_targets)} unique targets")
    print(f"Target overlap: {len(overlap)} (should be 0)")

    # Display sample data
    print("\nSample DTI data:")
    print(full_data.head(3))

    return split_cold_drug


def evaluation_example(split):
    """
    Example: Evaluating model predictions with TDC evaluators
    """
    print("\n" + "=" * 60)
    print("Example 3: Model Evaluation")
    print("=" * 60)

    test = split['test']

    # For demonstration, create dummy predictions
    # In practice, replace with your model's predictions
    np.random.seed(42)

    # Simulate predictions (replace with model.predict(test['Drug']))
    y_true = test['Y'].values
    y_pred = y_true + np.random.normal(0, 0.5, len(y_true))  # Add noise

    # Evaluate with different metrics
    print("\nEvaluating predictions...")

    # Regression metrics
    mae_evaluator = Evaluator(name='MAE')
    mae = mae_evaluator(y_true, y_pred)
    print(f"MAE: {mae:.4f}")

    rmse_evaluator = Evaluator(name='RMSE')
    rmse = rmse_evaluator(y_true, y_pred)
    print(f"RMSE: {rmse:.4f}")

    r2_evaluator = Evaluator(name='R2')
    r2 = r2_evaluator(y_true, y_pred)
    print(f"R²: {r2:.4f}")

    spearman_evaluator = Evaluator(name='Spearman')
    spearman = spearman_evaluator(y_true, y_pred)
    print(f"Spearman: {spearman:.4f}")


def custom_split_example():
    """
    Example: Creating custom splits with different fractions
    """
    print("\n" + "=" * 60)
    print("Example 4: Custom Split Fractions")
    print("=" * 60)

    data = ADME(name='HIA_Hou')

    # Custom split fractions
    custom_fracs = [
        ([0.6, 0.2, 0.2], "60/20/20 split"),
        ([0.8, 0.1, 0.1], "80/10/10 split"),
        ([0.7, 0.15, 0.15], "70/15/15 split")
    ]

    for frac, description in custom_fracs:
        split = data.get_split(method='scaffold', seed=42, frac=frac)
        print(f"\n{description}:")
        print(f"  Train: {len(split['train'])} ({frac[0]*100:.0f}%)")
        print(f"  Valid: {len(split['valid'])} ({frac[1]*100:.0f}%)")
        print(f"  Test: {len(split['test'])} ({frac[2]*100:.0f}%)")


def main():
    """
    Main function to run all examples
    """
    print("\n" + "=" * 60)
    print("TDC Data Loading and Splitting Examples")
    print("=" * 60)

    # Example 1: Single prediction with scaffold split
    split = load_single_pred_example()

    # Example 2: Multi prediction with cold splits
    dti_split = load_multi_pred_example()

    # Example 3: Model evaluation
    evaluation_example(split)

    # Example 4: Custom split fractions
    custom_split_example()

    print("\n" + "=" * 60)
    print("Examples completed!")
    print("=" * 60)


if __name__ == "__main__":
    main()
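The cold-drug split in the script above guarantees that no test-set drug ever appears in training. A minimal stdlib sketch of that idea, using hypothetical `(drug, target, y)` tuples rather than TDC data (TDC's actual implementation also handles a validation fraction and DataFrames):

```python
import random

def cold_drug_split(pairs, test_frac=0.2, seed=42):
    """Split (drug, target, y) pairs so that test-set drugs
    never occur in the training set."""
    drugs = sorted({drug for drug, _, _ in pairs})
    rng = random.Random(seed)
    rng.shuffle(drugs)
    n_test = max(1, int(len(drugs) * test_frac))
    test_drugs = set(drugs[:n_test])  # hold out whole drugs, not pairs
    train = [p for p in pairs if p[0] not in test_drugs]
    test = [p for p in pairs if p[0] in test_drugs]
    return train, test

# Toy data: 20 pairs over 5 unique drugs
pairs = [(f"D{i % 5}", f"T{i}", float(i)) for i in range(20)]
train, test = cold_drug_split(pairs)
overlap = {d for d, _, _ in train} & {d for d, _, _ in test}
print(f"train={len(train)} test={len(test)} overlap={len(overlap)}")
```

The overlap is zero by construction, which is exactly the invariant the script checks with `train_drugs & test_drugs`.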
404
skills/pytdc/scripts/molecular_generation.py
Normal file
@@ -0,0 +1,404 @@
#!/usr/bin/env python3
"""
TDC Molecular Generation with Oracles Template

This script demonstrates how to use TDC oracles for molecular generation
tasks including goal-directed generation and distribution learning.

Usage:
    python molecular_generation.py
"""

from tdc.generation import MolGen
from tdc import Oracle
import numpy as np


def load_generation_dataset():
    """
    Load molecular generation dataset
    """
    print("=" * 60)
    print("Loading Molecular Generation Dataset")
    print("=" * 60)

    # Load ChEMBL dataset
    data = MolGen(name='ChEMBL_V29')

    # Get training molecules
    split = data.get_split()
    train_smiles = split['train']['Drug'].tolist()

    print("\nDataset: ChEMBL_V29")
    print(f"Training molecules: {len(train_smiles)}")

    # Display sample molecules
    print("\nSample SMILES:")
    for i, smiles in enumerate(train_smiles[:5], 1):
        print(f"  {i}. {smiles}")

    return train_smiles


def single_oracle_example():
    """
    Example: Using a single oracle for molecular evaluation
    """
    print("\n" + "=" * 60)
    print("Example 1: Single Oracle Evaluation")
    print("=" * 60)

    # Initialize oracle for GSK3B target
    oracle = Oracle(name='GSK3B')

    # Test molecules
    test_molecules = [
        'CC(C)Cc1ccc(cc1)C(C)C(O)=O',   # Ibuprofen
        'CC(=O)Oc1ccccc1C(=O)O',        # Aspirin
        'Cn1c(=O)c2c(ncn2C)n(C)c1=O',   # Caffeine
        'CN1C=NC2=C1C(=O)N(C(=O)N2C)C'  # Theophylline
    ]

    print("\nEvaluating molecules with GSK3B oracle:")
    print("-" * 60)

    for smiles in test_molecules:
        score = oracle(smiles)
        print(f"SMILES: {smiles}")
        print(f"GSK3B score: {score:.4f}\n")


def multiple_oracles_example():
    """
    Example: Using multiple oracles for multi-objective optimization
    """
    print("\n" + "=" * 60)
    print("Example 2: Multiple Oracles (Multi-Objective)")
    print("=" * 60)

    # Initialize multiple oracles
    oracles = {
        'QED': Oracle(name='QED'),      # Drug-likeness
        'SA': Oracle(name='SA'),        # Synthetic accessibility
        'GSK3B': Oracle(name='GSK3B'),  # Target binding
        'LogP': Oracle(name='LogP')     # Lipophilicity
    }

    # Test molecule
    test_smiles = 'CC(C)Cc1ccc(cc1)C(C)C(O)=O'

    print(f"\nEvaluating: {test_smiles}")
    print("-" * 60)

    scores = {}
    for name, oracle in oracles.items():
        score = oracle(test_smiles)
        scores[name] = score
        print(f"{name:10s}: {score:.4f}")

    # Multi-objective score (weighted combination)
    print("\n--- Multi-Objective Scoring ---")

    # Invert SA (lower is better, so we invert for maximization)
    sa_score = 1.0 / (1.0 + scores['SA'])

    # Weighted combination
    weights = {'QED': 0.3, 'SA': 0.2, 'GSK3B': 0.4, 'LogP': 0.1}
    multi_score = (
        weights['QED'] * scores['QED'] +
        weights['SA'] * sa_score +
        weights['GSK3B'] * scores['GSK3B'] +
        weights['LogP'] * (scores['LogP'] / 5.0)  # Normalize LogP
    )

    print(f"Multi-objective score: {multi_score:.4f}")
    print(f"Weights: {weights}")


def batch_evaluation_example():
    """
    Example: Batch evaluation of multiple molecules
    """
    print("\n" + "=" * 60)
    print("Example 3: Batch Evaluation")
    print("=" * 60)

    # Generate sample molecules
    molecules = [
        'CC(C)Cc1ccc(cc1)C(C)C(O)=O',
        'CC(=O)Oc1ccccc1C(=O)O',
        'Cn1c(=O)c2c(ncn2C)n(C)c1=O',
        'CN1C=NC2=C1C(=O)N(C(=O)N2C)C',
        'CC(C)NCC(COc1ccc(cc1)COCCOC(C)C)O'
    ]

    # Initialize oracle
    oracle = Oracle(name='DRD2')

    print(f"\nBatch evaluating {len(molecules)} molecules with DRD2 oracle...")

    # Batch evaluation (more efficient than individual calls)
    scores = oracle(molecules)

    print("\nResults:")
    print("-" * 60)
    for smiles, score in zip(molecules, scores):
        print(f"{smiles[:40]:40s}... Score: {score:.4f}")

    # Statistics
    print("\nStatistics:")
    print(f"  Mean score: {np.mean(scores):.4f}")
    print(f"  Std score: {np.std(scores):.4f}")
    print(f"  Min score: {np.min(scores):.4f}")
    print(f"  Max score: {np.max(scores):.4f}")


def goal_directed_generation_template():
    """
    Template for goal-directed molecular generation
    """
    print("\n" + "=" * 60)
    print("Example 4: Goal-Directed Generation Template")
    print("=" * 60)

    template = '''
# Template for goal-directed molecular generation

from tdc.generation import MolGen
from tdc import Oracle
import numpy as np

# 1. Load training data
data = MolGen(name='ChEMBL_V29')
train_smiles = data.get_split()['train']['Drug'].tolist()

# 2. Initialize oracle(s)
oracle = Oracle(name='GSK3B')

# 3. Initialize your generative model
# model = YourGenerativeModel()
# model.fit(train_smiles)

# 4. Generation loop
num_iterations = 100
num_molecules_per_iter = 100
best_molecules = []

for iteration in range(num_iterations):
    # Generate candidate molecules
    # candidates = model.generate(num_molecules_per_iter)

    # Evaluate with oracle (requires `candidates` from your model above)
    scores = oracle(candidates)

    # Select top molecules
    top_indices = np.argsort(scores)[-10:]
    top_molecules = [candidates[i] for i in top_indices]
    top_scores = [scores[i] for i in top_indices]

    # Store best molecules
    best_molecules.extend(zip(top_molecules, top_scores))

    # Optional: Fine-tune model on top molecules
    # model.fine_tune(top_molecules)

    # Print progress
    print(f"Iteration {iteration}: Best score = {max(scores):.4f}")

# Sort and display top molecules
best_molecules.sort(key=lambda x: x[1], reverse=True)
print("\\nTop 10 molecules:")
for smiles, score in best_molecules[:10]:
    print(f"{smiles}: {score:.4f}")
'''

    print("\nGoal-Directed Generation Template:")
    print("=" * 60)
    print(template)


def distribution_learning_example(train_smiles):
    """
    Example: Distribution learning evaluation
    """
    print("\n" + "=" * 60)
    print("Example 5: Distribution Learning")
    print("=" * 60)

    # Use subset for demonstration
    train_subset = train_smiles[:1000]

    # Initialize oracle
    oracle = Oracle(name='QED')

    print("\nEvaluating property distribution...")

    # Evaluate training set
    print("Computing training set distribution...")
    train_scores = np.array(oracle(train_subset))

    # Simulate generated molecules (in practice, use your generative model)
    # For demo: add noise to training molecules
    print("Computing generated set distribution...")
    generated_scores = train_scores + np.random.normal(0, 0.1, len(train_scores))
    generated_scores = np.clip(generated_scores, 0, 1)  # QED is [0, 1]

    # Compare distributions
    print("\n--- Distribution Statistics ---")
    print(f"Training set (n={len(train_subset)}):")
    print(f"  Mean: {np.mean(train_scores):.4f}")
    print(f"  Std: {np.std(train_scores):.4f}")
    print(f"  Median: {np.median(train_scores):.4f}")

    print(f"\nGenerated set (n={len(generated_scores)}):")
    print(f"  Mean: {np.mean(generated_scores):.4f}")
    print(f"  Std: {np.std(generated_scores):.4f}")
    print(f"  Median: {np.median(generated_scores):.4f}")

    # Distribution similarity metrics
    from scipy.stats import ks_2samp
    ks_statistic, p_value = ks_2samp(train_scores, generated_scores)

    print("\nKolmogorov-Smirnov Test:")
    print(f"  KS statistic: {ks_statistic:.4f}")
    print(f"  P-value: {p_value:.4f}")

    if p_value > 0.05:
        print("  → Distributions are similar (p > 0.05)")
    else:
        print("  → Distributions are significantly different (p < 0.05)")


def available_oracles_info():
    """
    Display information about available oracles
    """
    print("\n" + "=" * 60)
    print("Example 6: Available Oracles")
    print("=" * 60)

    oracle_info = {
        'Biochemical Targets': [
            'DRD2', 'GSK3B', 'JNK3', '5HT2A', 'ACE',
            'MAPK', 'CDK', 'P38', 'PARP1', 'PIK3CA'
        ],
        'Physicochemical Properties': [
            'QED', 'SA', 'LogP', 'MW', 'Lipinski'
        ],
        'Composite Metrics': [
            'Isomer_Meta', 'Median1', 'Median2',
            'Rediscovery', 'Similarity', 'Uniqueness', 'Novelty'
        ],
        'Specialized': [
            'ASKCOS', 'Docking', 'Vina'
        ]
    }

    print("\nAvailable Oracle Categories:")
    print("-" * 60)

    for category, oracles in oracle_info.items():
        print(f"\n{category}:")
        for oracle_name in oracles:
            print(f"  - {oracle_name}")

    print("\nFor detailed oracle documentation, see:")
    print("  references/oracles.md")


def constraint_satisfaction_example():
    """
    Example: Molecular generation with constraints
    """
    print("\n" + "=" * 60)
    print("Example 7: Constraint Satisfaction")
    print("=" * 60)

    # Define constraints
    constraints = {
        'QED': (0.5, 1.0),  # Drug-likeness >= 0.5
        'SA': (1.0, 5.0),   # Easy to synthesize
        'MW': (200, 500),   # Molecular weight 200-500 Da
        'LogP': (0, 3)      # Lipophilicity 0-3
    }

    # Initialize oracles
    oracles = {name: Oracle(name=name) for name in constraints.keys()}

    # Test molecules
    test_molecules = [
        'CC(C)Cc1ccc(cc1)C(C)C(O)=O',
        'CC(=O)Oc1ccccc1C(=O)O',
        'Cn1c(=O)c2c(ncn2C)n(C)c1=O'
    ]

    print("\nConstraints:")
    for prop, (min_val, max_val) in constraints.items():
        print(f"  {prop}: [{min_val}, {max_val}]")

    print("\n" + "-" * 60)
    print("Evaluating molecules against constraints:")
    print("-" * 60)

    for smiles in test_molecules:
        print(f"\nSMILES: {smiles}")

        satisfies_all = True
        for prop, (min_val, max_val) in constraints.items():
            score = oracles[prop](smiles)
            satisfies = min_val <= score <= max_val

            status = "✓" if satisfies else "✗"
            print(f"  {prop:10s}: {score:7.2f} [{min_val:5.1f}, {max_val:5.1f}] {status}")

            satisfies_all = satisfies_all and satisfies

        result = "PASS" if satisfies_all else "FAIL"
        print(f"  Overall: {result}")


def main():
    """
    Main function to run all molecular generation examples
    """
    print("\n" + "=" * 60)
    print("TDC Molecular Generation with Oracles Examples")
    print("=" * 60)

    # Load generation dataset
    train_smiles = load_generation_dataset()

    # Example 1: Single oracle
    single_oracle_example()

    # Example 2: Multiple oracles
    multiple_oracles_example()

    # Example 3: Batch evaluation
    batch_evaluation_example()

    # Example 4: Goal-directed generation template
    goal_directed_generation_template()

    # Example 5: Distribution learning
    distribution_learning_example(train_smiles)

    # Example 6: Available oracles
    available_oracles_info()

    # Example 7: Constraint satisfaction
    constraint_satisfaction_example()

    print("\n" + "=" * 60)
    print("Molecular generation examples completed!")
    print("=" * 60)
    print("\nNext steps:")
    print("1. Implement your generative model")
    print("2. Use oracles to guide generation")
    print("3. Evaluate generated molecules")
    print("4. Iterate and optimize")
    print("=" * 60)


if __name__ == "__main__":
    main()
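The weighted multi-objective score used in `multiple_oracles_example` can be factored into a reusable helper. A stdlib sketch of the same arithmetic, with property values hard-coded for illustration (in the script they come from TDC oracles):

```python
def multi_objective_score(scores, weights):
    """Weighted combination of normalized property scores.
    SA is inverted (lower is better) and LogP is roughly scaled
    into [0, 1] by dividing by 5, mirroring the script above."""
    normalized = {
        'QED': scores['QED'],               # already in [0, 1]
        'SA': 1.0 / (1.0 + scores['SA']),   # invert: lower SA is better
        'GSK3B': scores['GSK3B'],           # oracle score in [0, 1]
        'LogP': scores['LogP'] / 5.0,       # rough normalization
    }
    return sum(weights[k] * normalized[k] for k in weights)

# Hypothetical oracle outputs for one molecule
scores = {'QED': 0.8, 'SA': 3.0, 'GSK3B': 0.5, 'LogP': 2.5}
weights = {'QED': 0.3, 'SA': 0.2, 'GSK3B': 0.4, 'LogP': 0.1}
total = multi_objective_score(scores, weights)
print(f"multi-objective score: {total:.4f}")  # → 0.5400
```

Keeping the normalization explicit in one place makes it easy to swap in different weightings or normalization ranges when tuning the generation objective.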