Initial commit

2025-11-30 08:30:10 +08:00
commit f0bd18fb4e
824 changed files with 331919 additions and 0 deletions
--- a/skills/opentargets-database/references/api_reference.md
+++ b/skills/opentargets-database/references/api_reference.md
@@ -0,0 +1,249 @@
+# Open Targets Platform API Reference
+
+## API Endpoint
+
+```
+https://api.platform.opentargets.org/api/v4/graphql
+```
+
+Interactive GraphQL playground with documentation:
+```
+https://api.platform.opentargets.org/api/v4/graphql/browser
+```
+
+## Access Methods
+
+The Open Targets Platform provides multiple access methods:
+
+1. **GraphQL API** - Best for single entity queries and flexible data retrieval
+2. **Web Interface** - Interactive platform at https://platform.opentargets.org
+3. **Data Downloads** - FTP at https://ftp.ebi.ac.uk/pub/databases/opentargets/platform/
+4. **Google BigQuery** - For large-scale systematic queries
+
+## Authentication
+
+No authentication is required for the GraphQL API. All data is freely accessible.
+
+## Rate Limits
+
+For systematic queries involving multiple targets or diseases, use dataset downloads or BigQuery instead of repeated API calls. The API is optimized for single-entity and exploratory queries.
+
+## GraphQL Query Structure
+
+GraphQL queries consist of:
+1. Query operation with optional variables
+2. Field selection (request only needed fields)
+3. Nested entity traversal
+
+### Basic Python Example
+
+```python
+import requests
+import json
+
+# Define the query
+query_string = """
+  query target($ensemblId: String!){
+    target(ensemblId: $ensemblId){
+      id
+      approvedSymbol
+      biotype
+      geneticConstraint {
+        constraintType
+        exp
+        obs
+        score
+      }
+    }
+  }
+"""
+
+# Define variables
+variables = {"ensemblId": "ENSG00000169083"}
+
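+# Make the request
+base_url = "https://api.platform.opentargets.org/api/v4/graphql"
+response = requests.post(
+    base_url,
+    json={"query": query_string, "variables": variables},
+    timeout=30,
+)
+response.raise_for_status()  # surface transport-level failures (4xx/5xx)
+
+# Parse the JSON body and pretty-print it
+data = response.json()
+print(json.dumps(data, indent=2))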
+```
+
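+## Available Query Endpoints
+
+The "endpoints" below are root query fields of the single GraphQL endpoint above, not separate REST paths. For example, `/target` is queried as `target(ensemblId: ...)`.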
+
+### /target
+Retrieve gene annotations, tractability assessments, and disease associations.
+
+**Common fields:**
+- `id` - Ensembl gene ID
+- `approvedSymbol` - HGNC gene symbol
+- `approvedName` - Full gene name
+- `biotype` - Gene type (protein_coding, etc.)
+- `tractability` - Druggability assessment
+- `safetyLiabilities` - Safety information
+- `expressions` - Baseline expression data
+- `knownDrugs` - Approved/clinical drugs
+- `associatedDiseases` - Disease associations with evidence
+
+### /disease
+Retrieve disease/phenotype data, known drugs, and clinical information.
+
+**Common fields:**
+- `id` - EFO disease identifier
+- `name` - Disease name
+- `description` - Disease description
+- `therapeuticAreas` - High-level disease categories
+- `synonyms` - Alternative names
+- `knownDrugs` - Drugs indicated for disease
+- `associatedTargets` - Target associations with evidence
+
+### /drug
+Retrieve compound details, mechanisms of action, and pharmacovigilance data.
+
+**Common fields:**
+- `id` - ChEMBL identifier
+- `name` - Drug name
+- `drugType` - Small molecule, antibody, etc.
+- `maximumClinicalTrialPhase` - Development stage
+- `indications` - Disease indications
+- `mechanismsOfAction` - Target mechanisms
+- `adverseEvents` - Pharmacovigilance data
+
+### /search
+Search across all entities (targets, diseases, drugs).
+
+**Parameters:**
+- `queryString` - Search term
+- `entityNames` - Filter by entity type(s)
+- `page` - Pagination
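
The three search parameters above can be combined into one request body. The sketch below only builds the payload and sends nothing. It assumes the `page` argument takes an input type named `Pagination` with `index` and `size` fields, which should be confirmed in the GraphQL browser. The names `SEARCH_QUERY` and `build_search_payload` are illustrative and not part of the API.

```python
# Assumption: `page` uses an input type named `Pagination` with
# `index` and `size` fields; verify this in the GraphQL browser.
SEARCH_QUERY = """
  query searchEntities($queryString: String!, $entityNames: [String!], $page: Pagination) {
    search(queryString: $queryString, entityNames: $entityNames, page: $page) {
      hits {
        id
        entity
        name
      }
    }
  }
"""


def build_search_payload(term, entity_names=None, index=0, size=10):
    """Return a dict suitable for `requests.post(url, json=payload)`."""
    return {
        "query": SEARCH_QUERY,
        "variables": {
            "queryString": term,
            # None is sent as JSON null, i.e. no entity filter.
            "entityNames": entity_names,
            "page": {"index": index, "size": size},
        },
    }


payload = build_search_payload("alzheimer", ["disease"], index=1, size=25)
print(payload["variables"])
```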
+
+### /associationDiseaseIndirect
+Retrieve target-disease associations including indirect evidence from disease descendants in ontology.
+
+**Key fields:**
+- `rows` - Association records with scores
+- `aggregations` - Aggregated statistics
+
+## Example Queries
+
+### Query 1: Get target information with disease associations
+
+```python
+query = """
+  query targetInfo($ensemblId: String!) {
+    target(ensemblId: $ensemblId) {
+      approvedSymbol
+      approvedName
+      tractability {
+        label
+        modality
+        value
+      }
+      associatedDiseases(page: {size: 10}) {
+        rows {
+          disease {
+            name
+          }
+          score
+          datatypeScores {
+            componentId
+            score
+          }
+        }
+      }
+    }
+  }
+"""
+variables = {"ensemblId": "ENSG00000157764"}
+```
+
+### Query 2: Search for diseases
+
+```python
+query = """
+  query searchDiseases($queryString: String!) {
+    search(queryString: $queryString, entityNames: ["disease"]) {
+      hits {
+        id
+        entity
+        name
+        description
+      }
+    }
+  }
+"""
+variables = {"queryString": "alzheimer"}
+```
+
+### Query 3: Get evidence for target-disease pair
+
+```python
+query = """
+  query evidences($ensemblId: String!, $efoId: String!) {
+    disease(efoId: $efoId) {
+      evidences(ensemblIds: [$ensemblId], size: 100) {
+        rows {
+          datasourceId
+          datatypeId
+          score
+          studyId
+          literature
+        }
+      }
+    }
+  }
+"""
+variables = {"ensemblId": "ENSG00000157764", "efoId": "EFO_0000249"}
+```
+
+### Query 4: Get known drugs for a disease
+
+```python
+query = """
+  query knownDrugs($efoId: String!) {
+    disease(efoId: $efoId) {
+      knownDrugs {
+        uniqueDrugs
+        rows {
+          drug {
+            name
+            id
+          }
+          targets {
+            approvedSymbol
+          }
+          phase
+          status
+        }
+      }
+    }
+  }
+"""
+variables = {"efoId": "EFO_0000249"}
+```
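
Once the `knownDrugs` block comes back, the same drug can appear in several rows, one per trial or indication. A minimal sketch of collapsing those rows to the highest phase per drug follows. The sample dictionary is made up for illustration and only mirrors the shape of the fields selected in Query 4. `max_phase_by_drug` is a hypothetical helper name.

```python
def max_phase_by_drug(known_drugs):
    """Map each drug name to the highest phase seen in `rows`."""
    phases = {}
    for row in known_drugs.get("rows", []):
        name = row["drug"]["name"]
        phase = row.get("phase") or 0  # treat a missing or null phase as 0
        phases[name] = max(phases.get(name, 0), phase)
    return phases


# Illustrative data only; not real API output.
sample = {
    "uniqueDrugs": 2,
    "rows": [
        {"drug": {"name": "DRUG_A", "id": "CHEMBL_A"}, "phase": 2, "status": "Completed"},
        {"drug": {"name": "DRUG_A", "id": "CHEMBL_A"}, "phase": 4, "status": None},
        {"drug": {"name": "DRUG_B", "id": "CHEMBL_B"}, "phase": 1, "status": "Terminated"},
    ],
}
print(max_phase_by_drug(sample))
```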
+
+## Error Handling
+
+GraphQL servers typically return HTTP status 200 even when a query fails, so check the response body for an `errors` key rather than relying on the status code:
+
+```python
+response_data = response.json()  # `response` is the result of requests.post(...)
+if 'errors' in response_data:
+    print(f"GraphQL errors: {response_data['errors']}")
+else:
+    print(f"Data: {response_data['data']}")
+```
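
In scripts it is often more convenient to raise on errors than to print them. The sketch below wraps the check above in a small helper. `GraphQLError` and `unwrap_graphql` are hypothetical names, and the sample dictionaries stand in for parsed response bodies.

```python
class GraphQLError(RuntimeError):
    """Raised when the response body reports GraphQL errors."""


def unwrap_graphql(response_data):
    """Return the `data` payload, or raise if the body lists errors."""
    errors = response_data.get("errors")
    if errors:
        # Error entries are normally dicts with a "message" key.
        messages = "; ".join(
            str(e.get("message", e)) if isinstance(e, dict) else str(e)
            for e in errors
        )
        raise GraphQLError(messages)
    return response_data.get("data")


# Stand-ins for parsed response bodies.
ok_body = {"data": {"target": {"id": "ENSG00000157764"}}}
bad_body = {"errors": [{"message": "Cannot query field 'foo'"}]}

print(unwrap_graphql(ok_body))
try:
    unwrap_graphql(bad_body)
except GraphQLError as exc:
    print(f"Query failed: {exc}")
```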
+
+## Best Practices
+
+1. **Request only needed fields** - Minimize data transfer and improve response time
+2. **Use variables** - Make queries reusable and safer
+3. **Handle pagination** - Most list fields support pagination with `page: {size: N, index: M}`
+4. **Explore the schema** - Use the GraphQL browser to discover available fields
+5. **Batch related queries** - Combine multiple entity fetches in a single query when possible
+6. **Cache results** - Store frequently accessed data locally to reduce API calls
+7. **Use BigQuery for bulk** - Switch to BigQuery/downloads for systematic analyses
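
One way to batch related queries (practice 5) while still using variables (practice 2) is GraphQL aliases, a standard GraphQL feature. The sketch below builds a single payload that fetches several targets under the aliases `t0`, `t1`, and so on. `build_batched_target_payload` is an illustrative helper and the default field list is an assumption.

```python
def build_batched_target_payload(ensembl_ids, fields="id approvedSymbol"):
    """Build one request body that fetches several targets via aliases."""
    if not ensembl_ids:
        raise ValueError("at least one Ensembl ID is required")
    indices = range(len(ensembl_ids))
    # One String! variable per ID: $id0, $id1, ...
    var_defs = ", ".join(f"$id{i}: String!" for i in indices)
    # One aliased selection per ID: t0: target(...), t1: target(...), ...
    selections = "\n".join(
        f"  t{i}: target(ensemblId: $id{i}) {{ {fields} }}" for i in indices
    )
    query = f"query batchedTargets({var_defs}) {{\n{selections}\n}}"
    variables = {f"id{i}": eid for i, eid in enumerate(ensembl_ids)}
    return {"query": query, "variables": variables}


batch = build_batched_target_payload(["ENSG00000157764", "ENSG00000169083"])
print(batch["query"])
```

Each alias appears as its own key under `data` in the response, so results can be matched back to the input IDs by index. Keep batches small, since the API is intended for exploratory rather than bulk use.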
+
+## Data Licensing
+
+All Open Targets Platform data is freely available. When using the data in research or commercial products, cite the latest publication:
+
+Buniello, A. et al. (2025) Open Targets Platform: facilitating therapeutic hypotheses building in drug discovery. Nucleic Acids Research, 53(D1):D1467-D1477.
--- a/skills/opentargets-database/references/evidence_types.md
+++ b/skills/opentargets-database/references/evidence_types.md
@@ -0,0 +1,306 @@
+# Evidence Types and Data Sources
+
+## Overview
+
+Evidence represents any event or set of events that identifies a target as a potential causal gene or protein for a disease. Evidence is standardized and mapped to:
+- **Ensembl gene IDs** for targets
+- **EFO (Experimental Factor Ontology)** for diseases/phenotypes
+
+Evidence is organized into **data types** (broader categories) and **data sources** (specific databases/studies).
+
+## Evidence Data Types
+
+### 1. Genetic Association
+
+Evidence from human genetics linking genetic variants to disease phenotypes.
+
+#### Data Sources:
+
+**GWAS (Genome-Wide Association Studies)**
+- Population-level common variant associations
+- Filtered with Locus-to-Gene (L2G) scores >0.05
+- Includes fine-mapping and colocalization data
+- Sources: GWAS Catalog, FinnGen, UK Biobank, EBI GWAS
+
+**Gene Burden Tests**
+- Rare variant association analyses
+- Aggregate effects of multiple rare variants in a gene
+- Particularly relevant for Mendelian and rare diseases
+
+**ClinVar Germline**
+- Clinical variant interpretations
+- Classifications: pathogenic, likely pathogenic, VUS, benign
+- Expert-reviewed variant-disease associations
+
+**Genomics England PanelApp**
+- Expert gene-disease ratings
+- Green (confirmed), amber (probable), red (no evidence)
+- Focus on rare diseases and cancer
+
+**Gene2Phenotype**
+- Curated gene-disease relationships
+- Allelic requirements and inheritance patterns
+- Clinical validity assessments
+
+**UniProt Literature & Variants**
+- Literature-based gene-disease associations
+- Expert-curated from scientific publications
+
+**Orphanet**
+- Rare disease gene associations
+- Expert-reviewed and maintained
+
+**ClinGen**
+- Clinical genome resource classifications
+- Gene-disease validity assertions
+
+### 2. Somatic Mutations
+
+Evidence from cancer genomics identifying driver genes and therapeutic targets.
+
+#### Data Sources:
+
+**Cancer Gene Census**
+- Expert-curated cancer genes
+- Tier classifications (1 = strong evidence, 2 = emerging)
+- Mutation types and cancer types
+
+**IntOGen**
+- Computational driver gene predictions
+- Aggregated from large cohort studies
+- Statistical significance of mutations
+
+**ClinVar Somatic**
+- Somatic clinical variant interpretations
+- Oncogenic/likely oncogenic classifications
+
+**Cancer Biomarkers**
+- FDA/EMA approved biomarkers
+- Clinical trial biomarkers
+- Prognostic and predictive markers
+
+### 3. Known Drugs
+
+Evidence from clinical precedence showing drugs targeting genes for disease indications.
+
+#### Data Source:
+
+**ChEMBL**
+- Approved drugs (Phase 4)
+- Clinical candidates (Phase 1-3)
+- Withdrawn drugs
+- Drug-target-indication triplets with mechanism of action
+
+**Clinical Trial Information:**
+- `phase`: Maximum clinical trial phase (1, 2, 3, 4)
+- `status`: Active, terminated, completed, withdrawn
+- `mechanismOfAction`: How drug affects target
+
+### 4. Affected Pathways
+
+Evidence linking genes to disease through pathway perturbations and functional screens.
+
+#### Data Sources:
+
+**CRISPR Screens**
+- Genome-scale knockout screens
+- Cancer dependency and essentiality data
+
+**Project Score (Cancer Dependency Map)**
+- CRISPR-Cas9 fitness screens across cancer cell lines
+- Gene essentiality profiles
+
+**SLAPenrich**
+- Pathway enrichment analysis
+- Somatic mutation pathway impacts
+
+**PROGENy**
+- Pathway activity inference
+- Signaling pathway perturbations
+
+**Reactome**
+- Expert-curated pathway annotations
+- Biological pathway representations
+
+**Gene Signatures**
+- Expression-based signatures
+- Pathway activity patterns
+
+### 5. RNA Expression
+
+Evidence from differential gene expression in disease vs. control tissues.
+
+#### Data Source:
+
+**Expression Atlas**
+- Differential expression data
+- Baseline expression across tissues/conditions
+- RNA-Seq and microarray studies
+- Log2 fold-change and p-values
+
+### 6. Animal Models
+
+Evidence from in vivo studies showing phenotypes associated with gene perturbations.
+
+#### Data Source:
+
+**IMPC (International Mouse Phenotyping Consortium)**
+- Systematic mouse knockout phenotypes
+- Phenotype-disease mappings via ontologies
+- Standardized phenotyping procedures
+
+### 7. Literature
+
+Evidence from text-mining of biomedical literature.
+
+#### Data Source:
+
+**Europe PMC**
+- Co-occurrence of genes and diseases in abstracts
+- Normalized citation counts
+- Weighted by publication type and recency
+
+## Evidence Scoring
+
+Each evidence source has its own scoring methodology:
+
+### Score Ranges
+- Most scores normalized to 0-1 range
+- Higher scores indicate stronger evidence
+- Scores are NOT confidence levels but relative strength indicators
+
+### Common Scoring Approaches:
+
+**Categorical Classifications:**
+- ClinVar: Pathogenic (1.0), Likely pathogenic (0.99), etc.
+- Gene2Phenotype: Confirmed/probable ratings
+- PanelApp: Green/amber/red classifications
+
+**Statistical Measures:**
+- GWAS: L2G scores incorporating multiple lines of evidence
+- Gene Burden: Statistical significance of variant aggregation
+- Expression: Adjusted p-values and fold-changes
+
+**Clinical Precedence:**
+- Known Drugs: Phase weights (Phase 4 = 1.0, Phase 3 = 0.8, etc.)
+- Clinical status modifiers
+
+**Computational Predictions:**
+- IntOGen: Q-values from driver mutation analysis
+- PROGENy/SLAPenrich: Pathway activity/enrichment scores
+
+## Evidence Interpretation Guidelines
+
+### Strengths by Data Type
+
+**Genetic Association** - Strongest human genetic evidence
+- Direct link between genetic variation and disease
+- Mendelian diseases: high confidence
+- GWAS: requires L2G to identify causal gene
+- Consider ancestry and population-specific effects
+
+**Somatic Mutations** - Direct evidence in cancer
+- Strong for oncology indications
+- Driver mutations indicate therapeutic potential
+- Consider cancer type specificity
+
+**Known Drugs** - Clinical validation
+- Highest confidence: approved drugs (Phase 4)
+- Consider mechanism relevance to new indication
+- Phase 1-2: early evidence, higher risk
+
+**Affected Pathways** - Mechanistic insights
+- Supports biological plausibility
+- May not predict clinical success
+- Useful for hypothesis generation
+
+**RNA Expression** - Observational evidence
+- Correlation, not causation
+- May reflect disease consequence vs. cause
+- Useful for biomarker identification
+
+**Animal Models** - Translational evidence
+- Strong for understanding biology
+- Variable translation to human disease
+- Most useful when phenotype matches human disease
+
+**Literature** - Exploratory signal
+- Text-mining captures research focus
+- May reflect publication bias
+- Requires manual literature review for validation
+
+### Important Considerations
+
+1. **Multiple evidence types strengthen confidence** - Convergent evidence from different data types provides stronger support
+
+2. **Under-studied diseases score lower** - Novel or rare diseases may have strong evidence but lower aggregate scores due to limited research
+
+3. **Association scores are not probabilities** - Scores rank relative evidence strength, not success probability
+
+4. **Context matters** - Evidence strength depends on:
+   - Disease mechanism understanding
+   - Target biology and druggability
+   - Clinical precedence in related indications
+   - Safety considerations
+
+5. **Data source reliability varies** - Weight expert-curated sources (ClinGen, Gene2Phenotype) higher than computational predictions
+
+## Using Evidence in Queries
+
+### Filtering by Data Type
+
+```python
+query = """
+  query evidenceByType($ensemblId: String!, $efoId: String!, $dataTypes: [String!]) {
+    disease(efoId: $efoId) {
+      evidences(ensemblIds: [$ensemblId], datatypes: $dataTypes) {
+        rows {
+          datasourceId
+          score
+        }
+      }
+    }
+  }
+"""
+variables = {
+    "ensemblId": "ENSG00000157764",
+    "efoId": "EFO_0000249",
+    "dataTypes": ["genetic_association", "somatic_mutation"]
+}
+```
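
After the evidence rows are returned, a common next step is to summarize them per data source. The sketch below counts rows and keeps the highest score for each `datasourceId`. The sample rows are invented for illustration, and `summarize_by_datasource` is a hypothetical helper.

```python
from collections import defaultdict


def summarize_by_datasource(rows):
    """Return {datasourceId: {"count": n, "max_score": s}} for evidence rows."""
    summary = defaultdict(lambda: {"count": 0, "max_score": 0.0})
    for row in rows:
        entry = summary[row["datasourceId"]]
        entry["count"] += 1
        entry["max_score"] = max(entry["max_score"], row["score"])
    return dict(summary)


# Illustrative rows only; scores are not real API output.
sample_rows = [
    {"datasourceId": "source_a", "score": 0.4},
    {"datasourceId": "source_a", "score": 0.9},
    {"datasourceId": "source_b", "score": 0.2},
]
print(summarize_by_datasource(sample_rows))
```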
+
+### Accessing Data Type Scores
+
+Data type scores aggregate all source scores within that type:
+
+```python
+query = """
+  query associationScores($ensemblId: String!, $efoId: String!) {
+    target(ensemblId: $ensemblId) {
+      associatedDiseases(efoIds: [$efoId]) {
+        rows {
+          disease {
+            name
+          }
+          score
+          datatypeScores {
+            componentId
+            score
+          }
+        }
+      }
+    }
+  }
+"""
+variables = {"ensemblId": "ENSG00000157764", "efoId": "EFO_0000249"}
+```
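
To judge whether several data types converge on the same association, the `datatypeScores` list can be flattened into a dictionary. The sketch below uses the field names from the query above (`componentId`, `score`). The sample row is made up, and both helper names are hypothetical.

```python
def datatype_score_map(row):
    """Flatten a row's datatypeScores list into {componentId: score}."""
    return {c["componentId"]: c["score"] for c in row.get("datatypeScores", [])}


def supporting_datatypes(row, min_score=0.0):
    """List data types whose score is strictly above `min_score`."""
    return sorted(k for k, v in datatype_score_map(row).items() if v > min_score)


# Illustrative row only; not real API output.
sample_row = {
    "disease": {"name": "example disease"},
    "score": 0.62,
    "datatypeScores": [
        {"componentId": "genetic_association", "score": 0.7},
        {"componentId": "literature", "score": 0.1},
        {"componentId": "animal_model", "score": 0.0},
    ],
}
print(supporting_datatypes(sample_row))
print(supporting_datatypes(sample_row, min_score=0.5))
```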
+
+## Evidence Quality Assessment
+
+When evaluating evidence:
+
+1. **Check multiple sources** - Single source may be unreliable
+2. **Prioritize human genetic evidence** - Strongest disease relevance
+3. **Consider clinical precedence** - Known drugs indicate druggability
+4. **Assess mechanistic support** - Pathway evidence supports biology
+5. **Review literature manually** - For critical decisions, read primary publications
+6. **Validate in primary databases** - Cross-reference with ClinVar, ClinGen, etc.
--- a/skills/opentargets-database/references/target_annotations.md
+++ b/skills/opentargets-database/references/target_annotations.md
@@ -0,0 +1,401 @@
+# Target Annotations and Features
+
+## Overview
+
+Open Targets defines a target as "any naturally-occurring molecule that can be targeted by a medicinal product." Targets are primarily protein-coding genes identified by Ensembl gene IDs, but also include RNAs and pseudogenes from canonical chromosomes.
+
+## Core Target Annotations
+
+### 1. Tractability Assessment
+
+Tractability evaluates the druggability potential of a target across different modalities.
+
+#### Modalities Assessed:
+
+**Small Molecule**
+- Prediction of small molecule druggability
+- Based on structural features, chemical precedence
+- Buckets: Clinical precedence, Discovery precedence, Predicted tractable
+
+**Antibody**
+- Likelihood of antibody-based therapeutic success
+- Cell surface/secreted protein location
+- Precedence categories similar to small molecules
+
+**PROTAC (Protein Degradation)**
+- Assessment for targeted protein degradation
+- E3 ligase compatibility
+- Emerging modality category
+
+**Other Modalities**
+- Gene therapy, RNA-based therapeutics
+- Oligonucleotide approaches
+
+#### Tractability Levels:
+
+1. **Clinical Precedence** - Target of approved/clinical drug with similar mechanism
+2. **Discovery Precedence** - Target of tool compounds or compounds in preclinical development
+3. **Predicted Tractable** - Computational predictions suggest druggability
+4. **Unknown** - Insufficient data to assess
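
The four levels form an ordering, which makes it easy to pick the strongest level reported for a target across modalities. A minimal sketch, assuming the level names are available as the strings listed above (the API itself may encode them differently):

```python
# Strongest first, following the numbered list above.
TRACTABILITY_ORDER = [
    "Clinical Precedence",
    "Discovery Precedence",
    "Predicted Tractable",
    "Unknown",
]


def best_level(levels):
    """Return the strongest recognized level, or "Unknown" if none is given."""
    known = [level for level in levels if level in TRACTABILITY_ORDER]
    if not known:
        return "Unknown"
    return min(known, key=TRACTABILITY_ORDER.index)


print(best_level(["Predicted Tractable", "Discovery Precedence"]))
```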
+
+### 2. Safety Liabilities
+
+Safety information aggregated from multiple sources to identify potential toxicity concerns.
+
+#### Data Sources:
+
+**ToxCast**
+- High-throughput toxicology screening data
+- In vitro assay results
+- Toxicity pathway activation
+
+**AOPWiki (Adverse Outcome Pathways)**
+- Mechanistic pathways from molecular initiating event to adverse outcome
+- Systems toxicology frameworks
+
+**PharmGKB**
+- Pharmacogenomic relationships
+- Genetic variants affecting drug response and toxicity
+
+**Published Literature**
+- Expert-curated safety concerns from publications
+- Clinical trial adverse events
+
+#### Safety Flags:
+
+- **Organ toxicity** - Liver, kidney, cardiac effects
+- **Target safety liability** - Known on-target toxic effects
+- **Off-target effects** - Unintended activity concerns
+- **Clinical observations** - Adverse events from drugs targeting gene
+
+### 3. Baseline Expression
+
+Gene/protein expression across tissues and cell types from multiple sources.
+
+#### Data Sources:
+
+**Expression Atlas**
+- RNA-Seq expression across tissues/conditions
+- Normalized expression levels (TPM, FPKM)
+- Differential expression studies
+
+**GTEx (Genotype-Tissue Expression)**
+- Comprehensive tissue expression from healthy donors
+- Median TPM across 53 tissues
+- Expression variation analysis
+
+**Human Protein Atlas**
+- Protein expression via immunohistochemistry
+- Subcellular localization
+- Tissue specificity classifications
+
+#### Expression Metrics:
+
+- **TPM (Transcripts Per Million)** - Normalized RNA abundance
+- **Tissue specificity** - Enrichment in specific tissues
+- **Protein level** - Correlation with RNA expression
+- **Subcellular location** - Where protein is found in cell
+
+### 4. Molecular Interactions
+
+Protein-protein interactions, complex memberships, and molecular partnerships.
+
+#### Interaction Types:
+
+**Physical Interactions**
+- Direct protein-protein binding
+- Complex components
+- Sources: IntAct, BioGRID, STRING
+
+**Pathway Membership**
+- Biological pathways from Reactome
+- Functional relationships
+- Upstream/downstream regulators
+
+**Target Interactors**
+- Direct interactors relevant to disease associations
+- Context-specific interactions
+
+### 5. Gene Essentiality
+
+Dependency data indicating whether a gene is essential for cell survival.
+
+#### Data Sources:
+
+**Project Score**
+- CRISPR-Cas9 fitness screens
+- 300+ cancer cell lines
+- Scaled essentiality scores (0-1)
+
+**DepMap Portal**
+- Large-scale cancer dependency data
+- Genetic and pharmacological perturbations
+- Common essential genes identification
+
+#### Essentiality Metrics:
+
+- **Score range**: 0 (non-essential) to 1 (essential)
+- **Context**: Cell line specific vs. pan-essential
+- **Therapeutic window**: Selectivity between disease and normal cells
+
+### 6. Chemical Probes and Tool Compounds
+
+High-quality small molecules for target validation.
+
+#### Sources:
+
+**Probes & Drugs Portal**
+- Chemical probes with characterized selectivity
+- Quality ratings and annotations
+- Target engagement data
+
+**Structural Genomics Consortium (SGC)**
+- Target Enabling Packages (TEPs)
+- Comprehensive target reagents
+- Freely available to academia
+
+**Probe Criteria:**
+- Potency (typically IC50 < 100 nM)
+- Selectivity (>30-fold vs. off-targets)
+- Cell activity demonstrated
+- Negative control available
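
The four criteria above translate directly into a boolean check. The sketch below encodes the stated thresholds. `ProbeCandidate` and its field names are hypothetical and do not correspond to fields in the API.

```python
from dataclasses import dataclass


@dataclass
class ProbeCandidate:
    ic50_nm: float            # potency, in nM
    selectivity_fold: float   # fold selectivity vs. off-targets
    cell_active: bool         # activity demonstrated in cells
    has_negative_control: bool


def meets_probe_criteria(candidate):
    """Apply the typical thresholds: IC50 < 100 nM and >30-fold selectivity."""
    return (
        candidate.ic50_nm < 100
        and candidate.selectivity_fold > 30
        and candidate.cell_active
        and candidate.has_negative_control
    )


print(meets_probe_criteria(ProbeCandidate(12.0, 150.0, True, True)))
print(meets_probe_criteria(ProbeCandidate(250.0, 150.0, True, True)))
```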
+
+### 7. Pharmacogenetics
+
+Genetic variants affecting drug response for drugs targeting the gene.
+
+#### Data Source: ClinPGx
+
+**Information Included:**
+- Variant-drug pairs
+- Clinical annotations (dosing, efficacy, toxicity)
+- Evidence level and sources
+- PharmGKB cross-references
+
+**Clinical Utility:**
+- Dosing adjustments based on genotype
+- Contraindications for specific variants
+- Efficacy predictors
+
+### 8. Genetic Constraint
+
+Measures of negative selection against variants in the gene.
+
+#### Data Source: gnomAD
+
+**Metrics:**
+
+**pLI (probability of Loss-of-function Intolerance)**
+- Range: 0-1
+- pLI > 0.9 indicates intolerance to LoF variants
+- High pLI suggests essentiality
+
+**LOEUF (Loss-of-function Observed/Expected Upper bound Fraction)**
+- Lower values indicate greater constraint
+- More interpretable than pLI across range
+
+**Missense Constraint**
+- Z-scores for missense depletion
+- O/E ratios for missense variants
+
+**Interpretation:**
+- High constraint suggests important biological function
+- May indicate safety concerns if inhibited
+- Essential genes often show high constraint
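
As a small worked example of the pLI threshold, the sketch below flags a gene as LoF-intolerant when pLI exceeds 0.9. It takes the pLI value directly. How that value is extracted from the `geneticConstraint` rows is left open here, and `is_lof_intolerant` is a hypothetical helper.

```python
def is_lof_intolerant(pli, threshold=0.9):
    """Return True when pLI is strictly above the threshold."""
    if not 0.0 <= pli <= 1.0:
        raise ValueError(f"pLI must be between 0 and 1, got {pli}")
    return pli > threshold


print(is_lof_intolerant(0.95))
print(is_lof_intolerant(0.10))
```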
+
+### 9. Comparative Genomics
+
+Cross-species gene conservation and ortholog information.
+
+#### Data Source: Ensembl Compara
+
+**Ortholog Data:**
+- Mouse, rat, zebrafish, other model organisms
+- Orthology confidence (1:1, 1:many, many:many)
+- Percent identity and similarity
+
+**Utility:**
+- Transferability of model organism studies
+- Functional conservation assessment
+- Evolution and selective pressure
+
+### 10. Cancer Annotations
+
+Cancer-specific target features for oncology indications.
+
+#### Data Sources:
+
+**Cancer Gene Census**
+- Role in cancer (oncogene, TSG, fusion)
+- Tier classification (1 = established, 2 = emerging)
+- Tumor types and mutation types
+
+**Cancer Hallmarks**
+- Functional roles in cancer biology
+- Hallmarks: proliferation, apoptosis evasion, metastasis, etc.
+- Links to specific cancer processes
+
+**Oncology Clinical Trials**
+- Drugs in development targeting gene for cancer
+- Trial phases and indications
+
+### 11. Mouse Phenotypes
+
+Phenotypes from mouse knockout/mutation studies.
+
+#### Data Source: MGI (Mouse Genome Informatics)
+
+**Phenotype Data:**
+- Knockout phenotypes
+- Disease model associations
+- Mammalian Phenotype Ontology (MP) terms
+
+**Utility:**
+- Predict on-target effects
+- Safety liability identification
+- Mechanism of action insights
+
+### 12. Pathways
+
+Biological pathway annotations placing target in functional context.
+
+#### Data Source: Reactome
+
+**Pathway Information:**
+- Curated biological pathways
+- Hierarchical organization
+- Pathway diagrams with target position
+
+**Applications:**
+- Mechanism hypothesis generation
+- Related target identification
+- Systems biology analysis
+
+## Using Target Annotations in Queries
+
+### Query Template: Comprehensive Target Profile
+
+```python
+query = """
+  query targetProfile($ensemblId: String!) {
+    target(ensemblId: $ensemblId) {
+      id
+      approvedSymbol
+      approvedName
+      biotype
+
+      # Tractability
+      tractability {
+        label
+        modality
+        value
+      }
+
+      # Safety
+      safetyLiabilities {
+        event
+        effects {
+          dosing
+          organsAffected
+        }
+      }
+
+      # Expression
+      expressions {
+        tissue {
+          label
+        }
+        rna {
+          value
+          level
+        }
+        protein {
+          level
+        }
+      }
+
+      # Chemical probes
+      chemicalProbes {
+        id
+        probeminer
+        origin
+      }
+
+      # Known drugs
+      knownDrugs {
+        uniqueDrugs
+        rows {
+          drug {
+            name
+            maximumClinicalTrialPhase
+          }
+          phase
+          status
+        }
+      }
+
+      # Genetic constraint
+      geneticConstraint {
+        constraintType
+        score
+        exp
+        obs
+      }
+
+      # Pathways
+      pathways {
+        pathway
+        pathwayId
+      }
+    }
+  }
+"""
+
+variables = {"ensemblId": "ENSG00000157764"}
+```
+
+## Annotation Interpretation Guidelines
+
+### For Target Prioritization:
+
+1. **Druggability (Tractability):**
+   - Clinical precedence >> Discovery precedence > Predicted
+   - Consider modality relevant to therapeutic approach
+   - Check for existing tool compounds
+
+2. **Safety Assessment:**
+   - Review organ toxicity signals
+   - Check expression in critical tissues
+   - Assess genetic constraint (high = safety concern if inhibited)
+   - Evaluate clinical adverse events from drugs
+
+3. **Disease Relevance:**
+   - Combine with association scores
+   - Check expression in disease-relevant tissues
+   - Review pathway context
+
+4. **Validation Readiness:**
+   - Chemical probes available?
+   - Model organism data supportive?
+   - Known drugs provide mechanism insight?
+
+5. **Clinical Path Considerations:**
+   - Pharmacogenetic factors
+   - Expression pattern (tissue-specific is better for selectivity)
+   - Essentiality (non-essential better for safety)
+
+### Red Flags:
+
+- **High essentiality + ubiquitous expression** - Poor therapeutic window
+- **Multiple safety liabilities** - Toxicity concerns
+- **High genetic constraint (pLI > 0.9)** - Critical gene, inhibition may be harmful
+- **No tractability precedence** - Higher risk, longer development
+- **Conflicting evidence** - Requires deeper investigation
+
+### Green Flags:
+
+- **Clinical precedence + related indication** - De-risked mechanism
+- **Tissue-specific expression** - Better selectivity
+- **Chemical probes available** - Faster validation
+- **Low essentiality + disease relevance** - Good therapeutic window
+- **Multiple evidence types converge** - Higher confidence
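
The red and green flags can be sketched as a simple triage function over a pre-assembled target summary. Everything about the input is an assumption: the `profile` keys are hypothetical, and the `essentiality_cutoff` default of 0.5 is an arbitrary placeholder, since no numeric cutoff for "high essentiality" is defined here. Only the pLI > 0.9 threshold comes from the text.

```python
def triage_flags(profile, essentiality_cutoff=0.5):
    """Return {"red": [...], "green": [...]} for a hypothetical profile dict."""
    red, green = [], []
    tractability = profile.get("tractability", "Unknown")
    # Assumed cutoff: scores at or above it count as "high essentiality".
    essential = profile.get("essentiality", 0.0) >= essentiality_cutoff

    if essential and profile.get("ubiquitous_expression", False):
        red.append("high essentiality + ubiquitous expression")
    if len(profile.get("safety_liabilities", [])) > 1:
        red.append("multiple safety liabilities")
    if profile.get("pli", 0.0) > 0.9:
        red.append("high genetic constraint (pLI > 0.9)")
    if tractability not in ("Clinical Precedence", "Discovery Precedence"):
        red.append("no tractability precedence")

    if tractability == "Clinical Precedence":
        green.append("clinical precedence")
    if profile.get("tissue_specific_expression", False):
        green.append("tissue-specific expression")
    if profile.get("chemical_probes", 0) > 0:
        green.append("chemical probes available")
    if not essential and profile.get("disease_relevant", False):
        green.append("low essentiality + disease relevance")
    return {"red": red, "green": green}


# Illustrative profile only.
example = {
    "tractability": "Clinical Precedence",
    "essentiality": 0.1,
    "ubiquitous_expression": False,
    "tissue_specific_expression": True,
    "chemical_probes": 2,
    "safety_liabilities": ["hepatotoxicity"],
    "pli": 0.95,
    "disease_relevant": True,
}
print(triage_flags(example))
```

A target can raise both kinds of flag at once, as the example does. Such cases fall under "conflicting evidence" and call for closer review rather than an automatic decision.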