Files
gh-ai4curation-curation-ski…/skills/ontology-access-kit/SKILL.md
2025-11-29 17:51:44 +08:00

106 lines
4.0 KiB
Markdown

---
name: ontology-access-kit
description: Skills for querying ontologies using the Ontology Access Kit (OAK). This should only be used for complex ontology operations, for basic external ontology searching use the OLS MCP
---
# OAK Guide
## Overview
OAK is a powerful command line library for accessing ontologies. It can be installed via:
- `uv add oaklib`
- `pip install oaklib`
The main command is `runoak`
## When to use
OAK is generally to be used for more complex operations.
- if you want to do basic search over external ontologies, you should favor the OLS MCP over OAK
- if you are working with local obo files, then hacky obo tools like `obo-grep.pl` may be better
## Adapters
You typically want to use the sqlite adapter. This gives you fast access to any ontology in OBO, plus a number of other commonly used ontologies, found in semantic-sql.
Example:
`runoak -i sqlite:obo:cl COMMAND COMMAND-OPTS ARGS`
Note the `-i` comes (*before*) the command-specific opts
You can also access any ontology in OLS or BioPortal:
- `runoak -i bioportal:snomedct relationships SNOMEDCT:128351009`
- `runoak -i bioportal:efo tree -p i EFO:0004200`
But some OAK commands may not be implemented.
With OLS or BioPortal you can also do searches over all ontologies:
- `runoak -i bioportal: info l~NovaSeq`
- `runoak -i ols: info l~NovaSeq`
To work with local obo files:
- `runoak -i impleobo:my_ont.obo info MY:1234 -O obo`
## Common Operations
You can find a list of all commands with `runoak --help`. oak is highly fully featured, and you are encouraged to
explore to find the functionality you need. We provide some examples below.
We use `info` for many examples, but note that many options and arguments work across different commands
* Lookup
* By exact label: `runoak -i sqlite:obo:cl info neuron` (returns `CL:0000540 ! neuron`)
* By exact label (multiple): `runoak -i sqlite:obo:uberon info finger toe`
* Search (any match): `runoak -i sqlite:obo:cl info 'l~T cell'`
* Search (starts with): `runoak -i sqlite:obo:cl info l^neuron`
* Fetching detailed info
* OBO format: `runoak -i sqlite:obo:cl info CL:0000540 -O obo`
* relationships: `runoak -i sqlite:obo:cl relationships --direction both CL:0000540`
* mappings: `runoak -i sqlite:obo:mondo mappings 'Marfan syndrome'`
* tree (is-a only): `runoak -i sqlite:obo:cl tree -p i CL:0000540`
* metadata: `runoak -i sqlite:obo:chebi term-metdata CHEBI:35235`
* Complex queries
* subclasses: `runoak -i sqlite:obo:cl info .sub CL:0000540 | head`
* disjunctions (OR): `runoak -i sqlite:obo:cl info .sub neuron .sub 'T cell' | tail`
* conjunctions: `runoak -i sqlite:obo:cl info .sub neuron .and .desc//p=i,p forebrain` (neurons and is-a/part-of the forebrain)
* minus: `runoak -i sqlite:obo:cl info .sub neuron .minus .desc//p=i,p forebrain` (neurons and NOT is-a/part-of the forebrain)
* Visualization
* `cl viz -p i,p,RO:0002215 'dopaminergic neuron' -o /tmp/dn.png` - subgraph from a CL term.
* note that graphviz requires installing og2dot
* Subsets
* list subsets: `runoak -i sqlite:obo:go subsets` - list all subsets (goslim_prokaryote etc)
* terms in subsets: `runoak -i sqlite:obo:go info .in goslim_generic` - all terms in a subset
* terms in subsets: `runoak -i sqlite:obo:go info .in goslim_generic .minus .in goslim_prokaryote` - all terms in a subset not in another
* Other
* `runoak lexmatch --help` for aligning ontologies
* `runoak statistics --help` for summary stats
## Common Options and Idioms
### Graphs
OAK is very graph oriented, following ontologies like GO, CL
Typically for graph operations you want to operate over only is-a and part-of, so use `-p i,p`
You can also specify RO/BFO ids.
E.g.
```bash
runoak -i sqlite:obo:ro info 'capable of'
RO:0002215 ! capable of
```
```bash
cl relationships -p RO:0002215 'dopaminergic neuron'
subject predicate object subject_label predicate_label object_label
CL:0000700 RO:0002215 GO:0061527 dopaminergic neuron capable of dopamine secretion, neurotransmission
```