# Target Annotations and Features

## Overview

Open Targets defines a target as "any naturally-occurring molecule that can be targeted by a medicinal product." Targets are primarily protein-coding genes identified by Ensembl gene IDs, but also include RNAs and pseudogenes from canonical chromosomes.

## Core Target Annotations

### 1. Tractability Assessment

Tractability evaluates the druggability potential of a target across different modalities.

#### Modalities Assessed:

**Small Molecule**
- Prediction of small molecule druggability
- Based on structural features, chemical precedence
- Buckets: Clinical precedence, Discovery precedence, Predicted tractable

**Antibody**
- Likelihood of antibody-based therapeutic success
- Cell surface/secreted protein location
- Precedence categories similar to small molecules

**PROTAC (Protein Degradation)**
- Assessment for targeted protein degradation
- E3 ligase compatibility
- Emerging modality category

**Other Modalities**
- Gene therapy, RNA-based therapeutics
- Oligonucleotide approaches

#### Tractability Levels:

1. **Clinical Precedence** - Target of approved/clinical drug with similar mechanism
2. **Discovery Precedence** - Target of tool compounds or compounds in preclinical development
3. **Predicted Tractable** - Computational predictions suggest druggability
4. **Unknown** - Insufficient data to assess
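
The four levels form an ordered scale, which makes it easy to compare modalities for a single target. The sketch below is illustrative only: the `TractabilityLevel` enum and the `best_level` helper are hypothetical names, not part of the Open Targets API, and the input dictionary is made-up example data.

```python
from enum import IntEnum


class TractabilityLevel(IntEnum):
    """Hypothetical ordering: a larger value means stronger precedence."""
    UNKNOWN = 0
    PREDICTED_TRACTABLE = 1
    DISCOVERY_PRECEDENCE = 2
    CLINICAL_PRECEDENCE = 3


def best_level(levels_by_modality):
    """Return the (modality, level) pair with the strongest precedence."""
    if not levels_by_modality:
        return None, TractabilityLevel.UNKNOWN
    modality = max(levels_by_modality, key=levels_by_modality.get)
    return modality, levels_by_modality[modality]


# Made-up assessment for a single target.
example = {
    "Small Molecule": TractabilityLevel.DISCOVERY_PRECEDENCE,
    "Antibody": TractabilityLevel.PREDICTED_TRACTABLE,
    "PROTAC": TractabilityLevel.UNKNOWN,
}
modality, level = best_level(example)
print(f"{modality}: {level.name}")
```

Using an `IntEnum` keeps the comparison explicit, so "Clinical Precedence" always outranks "Discovery Precedence" without string matching.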

### 2. Safety Liabilities

Safety information aggregated from multiple sources to identify potential toxicity concerns.

#### Data Sources:

**ToxCast**
- High-throughput toxicology screening data
- In vitro assay results
- Toxicity pathway activation

**AOPWiki (Adverse Outcome Pathways)**
- Mechanistic pathways from molecular initiating event to adverse outcome
- Systems toxicology frameworks

**PharmGKB**
- Pharmacogenomic relationships
- Genetic variants affecting drug response and toxicity

**Published Literature**
- Expert-curated safety concerns from publications
- Clinical trial adverse events

#### Safety Flags:

- **Organ toxicity** - Liver, kidney, cardiac effects
- **Target safety liability** - Known on-target toxic effects
- **Off-target effects** - Unintended activity concerns
- **Clinical observations** - Adverse events from drugs targeting gene

### 3. Baseline Expression

Gene/protein expression across tissues and cell types from multiple sources.

#### Data Sources:

**Expression Atlas**
- RNA-Seq expression across tissues/conditions
- Normalized expression levels (TPM, FPKM)
- Differential expression studies

**GTEx (Genotype-Tissue Expression)**
- Comprehensive tissue expression from healthy donors
- Median TPM across 53 tissues
- Expression variation analysis

**Human Protein Atlas**
- Protein expression via immunohistochemistry
- Subcellular localization
- Tissue specificity classifications

#### Expression Metrics:

- **TPM (Transcripts Per Million)** - Normalized RNA abundance
- **Tissue specificity** - Enrichment in specific tissues
- **Protein level** - Correlation with RNA expression
- **Subcellular location** - Where protein is found in cell
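
One way to turn per-tissue TPM values into a single tissue-specificity number is the tau index, a commonly used measure that is 0 for uniform expression and 1 for expression confined to one tissue. This is a swapped-in technique for illustration, not necessarily how Open Targets or its sources compute specificity. The function name and the example values are hypothetical, and the sketch uses raw TPM for simplicity, whereas tau is often computed on log-transformed values.

```python
def tau_specificity(tpm_by_tissue):
    """Tau index over per-tissue TPM values: 0 = uniform, 1 = single tissue."""
    values = list(tpm_by_tissue.values())
    n = len(values)
    max_value = max(values, default=0.0)
    # Tau is undefined for fewer than two tissues or no expression at all;
    # this sketch returns 0.0 in those cases.
    if n < 2 or max_value <= 0:
        return 0.0
    return sum(1 - v / max_value for v in values) / (n - 1)


# Made-up median TPM values for two imaginary genes.
broad = {"liver": 10.0, "heart": 10.0, "brain": 10.0}
restricted = {"liver": 10.0, "heart": 0.0, "brain": 0.0}

print(tau_specificity(broad))
print(tau_specificity(restricted))
```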

### 4. Molecular Interactions

Protein-protein interactions, complex memberships, and molecular partnerships.

#### Interaction Types:

**Physical Interactions**
- Direct protein-protein binding
- Complex components
- Sources: IntAct, BioGRID, STRING

**Pathway Membership**
- Biological pathways from Reactome
- Functional relationships
- Upstream/downstream regulators

**Target Interactors**
- Direct interactors relevant to disease associations
- Context-specific interactions

### 5. Gene Essentiality

Dependency data indicating whether a gene is essential for cell survival.

#### Data Sources:

**Project Score**
- CRISPR-Cas9 fitness screens
- 300+ cancer cell lines
- Scaled essentiality scores (0-1)

**DepMap Portal**
- Large-scale cancer dependency data
- Genetic and pharmacological perturbations
- Common essential genes identification

#### Essentiality Metrics:

- **Score range**: 0 (non-essential) to 1 (essential)
- **Context**: Cell line specific vs. pan-essential
- **Therapeutic window**: Selectivity between disease and normal cells
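
The distinction between cell-line-specific and pan-essential genes can be sketched as a simple classification over per-cell-line scores on the 0-1 scale described above. The cutoffs (`essential_cutoff`, `pan_fraction`), the function name, and the cell line names are assumptions chosen for illustration, not values published by Project Score or DepMap.

```python
def classify_dependency(scores_by_cell_line, essential_cutoff=0.5, pan_fraction=0.9):
    """Label a gene from per-cell-line essentiality scores in [0, 1].

    Both cutoffs are hypothetical defaults, not published thresholds.
    """
    if not scores_by_cell_line:
        return "no data"
    dependent = sum(1 for s in scores_by_cell_line.values() if s >= essential_cutoff)
    fraction = dependent / len(scores_by_cell_line)
    if fraction >= pan_fraction:
        return "pan-essential"
    if fraction > 0:
        return "context-specific"
    return "non-essential"


# Made-up scores for three imaginary cell lines.
print(classify_dependency({"line_A": 0.9, "line_B": 0.1, "line_C": 0.2}))
```

A context-specific result is the more interesting case for the therapeutic window, since it suggests the dependency is not shared by every cell type.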

### 6. Chemical Probes and Tool Compounds

High-quality small molecules for target validation.

#### Sources:

**Probes & Drugs Portal**
- Chemical probes with characterized selectivity
- Quality ratings and annotations
- Target engagement data

**Structural Genomics Consortium (SGC)**
- Target Enabling Packages (TEPs)
- Comprehensive target reagents
- Freely available to academia

**Probe Criteria:**
- Potency (typically IC50 < 100 nM)
- Selectivity (>30-fold vs. off-targets)
- Cell activity demonstrated
- Negative control available
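
The probe criteria above translate directly into a checklist. The sketch below is a minimal illustration: the `ProbeCandidate` dataclass, its field names, and the example compounds are hypothetical and do not correspond to any Open Targets or Probes & Drugs data structure. The thresholds are the typical values quoted in the list.

```python
from dataclasses import dataclass


@dataclass
class ProbeCandidate:
    """Hypothetical record for a candidate chemical probe."""
    name: str
    ic50_nm: float
    selectivity_fold: float
    active_in_cells: bool
    has_negative_control: bool


def failed_criteria(probe):
    """Return the names of the criteria the candidate does not meet."""
    failures = []
    if not probe.ic50_nm < 100:           # potency: IC50 < 100 nM
        failures.append("potency")
    if not probe.selectivity_fold > 30:   # selectivity: >30-fold vs. off-targets
        failures.append("selectivity")
    if not probe.active_in_cells:
        failures.append("cell activity")
    if not probe.has_negative_control:
        failures.append("negative control")
    return failures


# Made-up compounds.
good = ProbeCandidate("probe-1", ic50_nm=12.0, selectivity_fold=120.0,
                      active_in_cells=True, has_negative_control=True)
weak = ProbeCandidate("probe-2", ic50_nm=450.0, selectivity_fold=50.0,
                      active_in_cells=True, has_negative_control=False)

for candidate in (good, weak):
    print(candidate.name, failed_criteria(candidate) or "meets all criteria")
```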

### 7. Pharmacogenetics

Genetic variants affecting drug response for drugs targeting the gene.

#### Data Source: ClinPGx

**Information Included:**
- Variant-drug pairs
- Clinical annotations (dosing, efficacy, toxicity)
- Evidence level and sources
- PharmGKB cross-references

**Clinical Utility:**
- Dosing adjustments based on genotype
- Contraindications for specific variants
- Efficacy predictors

### 8. Genetic Constraint

Measures of negative selection against variants in the gene.

#### Data Source: gnomAD

**Metrics:**

**pLI (probability of Loss-of-function Intolerance)**
- Range: 0-1
- pLI > 0.9 indicates intolerance to LoF variants
- High pLI suggests essentiality

**LOEUF (Loss-of-function Observed/Expected Upper bound Fraction)**
- Lower values indicate greater constraint
- More interpretable than pLI across range

**Missense Constraint**
- Z-scores for missense depletion
- O/E ratios for missense variants

**Interpretation:**
- High constraint suggests important biological function
- May indicate safety concerns if inhibited
- Essential genes often show high constraint
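
The constraint metrics can be read together with a small helper. In the sketch below, the pLI cutoff of 0.9 comes from the text above. The LOEUF cutoff of 0.35 is an assumption picked for illustration and should be checked against the gnomAD release in use. The function name and the example values are hypothetical.

```python
def interpret_constraint(pli=None, loeuf=None, pli_cutoff=0.9, loeuf_cutoff=0.35):
    """Return human-readable notes for the constraint metrics that are present.

    pli_cutoff follows the pLI > 0.9 rule of thumb; loeuf_cutoff is an
    assumed value, since lower LOEUF means greater constraint.
    """
    notes = []
    if pli is not None and pli > pli_cutoff:
        notes.append("LoF-intolerant by pLI")
    if loeuf is not None and loeuf < loeuf_cutoff:
        notes.append("constrained by LOEUF")
    return notes


# Made-up values for two imaginary genes.
print(interpret_constraint(pli=0.99, loeuf=0.2))
print(interpret_constraint(pli=0.1, loeuf=1.2))
```

Missing metrics are skipped rather than treated as zero, so a gene without constraint data is not mistaken for an unconstrained one.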

### 9. Comparative Genomics

Cross-species gene conservation and ortholog information.

#### Data Source: Ensembl Compara

**Ortholog Data:**
- Mouse, rat, zebrafish, other model organisms
- Orthology type (1:1, 1:many, many:many)
- Percent identity and similarity

**Utility:**
- Transferability of model organism studies
- Functional conservation assessment
- Evolution and selective pressure

### 10. Cancer Annotations

Cancer-specific target features for oncology indications.

#### Data Sources:

**Cancer Gene Census**
- Role in cancer (oncogene, TSG, fusion)
- Tier classification (1 = established, 2 = emerging)
- Tumor types and mutation types

**Cancer Hallmarks**
- Functional roles in cancer biology
- Hallmarks: proliferation, apoptosis evasion, metastasis, etc.
- Links to specific cancer processes

**Oncology Clinical Trials**
- Drugs in development targeting gene for cancer
- Trial phases and indications

### 11. Mouse Phenotypes

Phenotypes from mouse knockout/mutation studies.

#### Data Source: MGI (Mouse Genome Informatics)

**Phenotype Data:**
- Knockout phenotypes
- Disease model associations
- Mammalian Phenotype Ontology (MP) terms

**Utility:**
- Predict on-target effects
- Safety liability identification
- Mechanism of action insights

### 12. Pathways

Biological pathway annotations placing target in functional context.

#### Data Source: Reactome

**Pathway Information:**
- Curated biological pathways
- Hierarchical organization
- Pathway diagrams with target position

**Applications:**
- Mechanism hypothesis generation
- Related target identification
- Systems biology analysis

## Using Target Annotations in Queries

### Query Template: Comprehensive Target Profile

```python
query = """
  query targetProfile($ensemblId: String!) {
    target(ensemblId: $ensemblId) {
      id
      approvedSymbol
      approvedName
      biotype

      # Tractability
      tractability {
        label
        modality
        value
      }

      # Safety
      safetyLiabilities {
        event
        effects {
          dosing
          organsAffected
        }
      }

      # Expression
      expressions {
        tissue {
          label
        }
        rna {
          value
          level
        }
        protein {
          level
        }
      }

      # Chemical probes
      chemicalProbes {
        id
        probeminer
        origin
      }

      # Known drugs
      knownDrugs {
        uniqueDrugs
        rows {
          drug {
            name
            maximumClinicalTrialPhase
          }
          phase
          status
        }
      }

      # Genetic constraint
      geneticConstraint {
        constraintType
        score
        exp
        obs
      }

      # Pathways
      pathways {
        pathway
        pathwayId
      }
    }
  }
"""

variables = {"ensemblId": "ENSG00000157764"}
```
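
The template above only defines the query text and its variables. A minimal sketch of sending it and reading part of the result follows, using only the standard library. The endpoint URL is assumed to be the public Open Targets Platform GraphQL endpoint, the helper names (`build_request`, `run_query`, `tractable_modalities`) are hypothetical, and the sample response is made up to show the expected shape of the `tractability` field. Field names in the template may differ between API versions, so a failing query should be checked against the live schema.

```python
import json
import urllib.request

# Assumed public endpoint; adjust if your deployment differs.
API_URL = "https://api.platform.opentargets.org/api/v4/graphql"


def build_request(query, variables, url=API_URL):
    """Build a JSON POST request carrying a GraphQL query and its variables."""
    body = json.dumps({"query": query, "variables": variables}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def run_query(query, variables, url=API_URL, timeout=30):
    """Send the query and return the `data` object, raising on GraphQL errors."""
    request = build_request(query, variables, url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        payload = json.load(response)
    if payload.get("errors"):
        raise RuntimeError(payload["errors"])
    return payload["data"]


def tractable_modalities(target):
    """List modalities that have at least one tractability bucket set to true."""
    buckets = target.get("tractability") or []
    return sorted({b["modality"] for b in buckets if b.get("value")})


# Made-up response fragment illustrating the shape of the tractability field.
sample_target = {
    "tractability": [
        {"label": "Approved Drug", "modality": "SM", "value": True},
        {"label": "UniProt loc high conf", "modality": "AB", "value": False},
    ]
}
print(tractable_modalities(sample_target))

# With network access, the template query would be run as:
# data = run_query(query, variables)
# print(data["target"]["approvedSymbol"])
```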

## Annotation Interpretation Guidelines

### For Target Prioritization:

1. **Druggability (Tractability):**
   - Clinical precedence >> Discovery precedence > Predicted
   - Consider modality relevant to therapeutic approach
   - Check for existing tool compounds

2. **Safety Assessment:**
   - Review organ toxicity signals
   - Check expression in critical tissues
   - Assess genetic constraint (high = safety concern if inhibited)
   - Evaluate clinical adverse events from drugs

3. **Disease Relevance:**
   - Combine with association scores
   - Check expression in disease-relevant tissues
   - Review pathway context

4. **Validation Readiness:**
   - Chemical probes available?
   - Model organism data supportive?
   - Known drugs provide mechanism insight?

5. **Clinical Path Considerations:**
   - Pharmacogenetic factors
   - Expression pattern (tissue-specific is better for selectivity)
   - Essentiality (non-essential better for safety)

### Red Flags:

- **High essentiality + ubiquitous expression** - Poor therapeutic window
- **Multiple safety liabilities** - Toxicity concerns
- **High genetic constraint (pLI > 0.9)** - Critical gene, inhibition may be harmful
- **No tractability precedence** - Higher risk, longer development
- **Conflicting evidence** - Requires deeper investigation

### Green Flags:

- **Clinical precedence + related indication** - De-risked mechanism
- **Tissue-specific expression** - Better selectivity
- **Chemical probes available** - Faster validation
- **Low essentiality + disease relevance** - Good therapeutic window
- **Multiple evidence types converge** - Higher confidence
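
Several of the red and green flags above can be checked mechanically once a target profile has been summarized. The sketch below assumes a hand-built `profile` dictionary. Its keys, the `assess_flags` helper, and the example values are all hypothetical, and flags that need judgment (conflicting or converging evidence) are deliberately left out.

```python
PRECEDENCE_LEVELS = ("Clinical Precedence", "Discovery Precedence")


def assess_flags(profile):
    """Collect red and green flags from a summarized target profile."""
    red, green = [], []
    level = profile.get("tractability_level")
    essential = bool(profile.get("essential"))

    # Red flags
    if essential and profile.get("ubiquitous_expression"):
        red.append("high essentiality with ubiquitous expression")
    if len(profile.get("safety_liabilities") or []) > 1:
        red.append("multiple safety liabilities")
    if (profile.get("pli") or 0) > 0.9:
        red.append("high genetic constraint")
    if level not in PRECEDENCE_LEVELS:
        red.append("no tractability precedence")

    # Green flags
    if level == "Clinical Precedence" and profile.get("related_indication"):
        green.append("clinical precedence in related indication")
    if profile.get("tissue_specific_expression"):
        green.append("tissue-specific expression")
    if profile.get("chemical_probes"):
        green.append("chemical probes available")
    if not essential and profile.get("disease_relevant"):
        green.append("low essentiality with disease relevance")

    return {"red": red, "green": green}


# Made-up profile for an imaginary target.
profile = {
    "essential": False,
    "ubiquitous_expression": True,
    "safety_liabilities": ["event A", "event B"],
    "pli": 0.2,
    "tractability_level": "Clinical Precedence",
    "related_indication": True,
    "tissue_specific_expression": False,
    "chemical_probes": ["probe-1"],
    "disease_relevant": True,
}
print(assess_flags(profile))
```

The output is a list of reasons rather than a single score, which keeps the trade-offs visible when flags of both colors apply to the same target.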