Initial commit

Zhongwei Li
2025-11-29 17:51:42 +08:00
commit 24486941f6
7 changed files with 682 additions and 0 deletions

@@ -0,0 +1,203 @@
---
name: editing-obo-ontologies
description: Skills and tools for editing OBO format ontologies, including querying terms, checking out/checking in individual terms, and following OBO format conventions. Do not use this if the source for the ontology you are editing is not in OBO format (e.g. ofn).
---
# OBO Ontology Editing Guide
This skill provides guidance and tools for editing ontologies in OBO format.
## Project Layout Conventions
Most OBO ontologies follow a similar structure:
- Main development file is typically `src/ontology/{ontology}-edit.obo`
- Individual terms can be checked out to `terms/` directory for editing
- Some projects may have different layouts - check the project's documentation
## Querying Ontology Terms
Use the `obo-grep.pl` script for searching OBO files:
- Look at a specific term by ID:
  - `obo-grep.pl --noheader -r 'id: ONTO:0004177' src/ontology/{ontology}-edit.obo`
- All mentions of an ID:
  - `obo-grep.pl --noheader -r 'ONTO:0004177' src/ontology/{ontology}-edit.obo`
- Search by regex (e.g., all mentions of hand or foot):
  - `obo-grep.pl --noheader -r '(hand|foot)' src/ontology/{ontology}-edit.obo`
- Search is much faster than full file reads
- ONLY search the main edit file (usually `src/ontology/{ontology}-edit.obo`)
- DO NOT do manual greps or read entire files unless necessary
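
The queries above match whole stanzas rather than single lines. As a rough illustration of that idea (a minimal Python sketch, not the implementation of `obo-grep.pl`; the function names and the sample ontology are hypothetical), stanza-level searching can be modelled like this:

```python
import re

def split_stanzas(obo_text):
    """Split OBO text into blank-line-separated blocks (the header comes first)."""
    return [b.strip() for b in re.split(r"\n\s*\n", obo_text) if b.strip()]

def grep_stanzas(obo_text, pattern, include_header=False):
    """Return every stanza that contains a match for the regex `pattern`."""
    regex = re.compile(pattern)
    hits = []
    for block in split_stanzas(obo_text):
        is_stanza = block.startswith("[")  # e.g. [Term] or [Typedef]
        if not is_stanza and not include_header:
            continue  # skip the header block unless it was requested
        if regex.search(block):
            hits.append(block)
    return hits

# Hypothetical two-term ontology used only for illustration.
SAMPLE = """format-version: 1.2
ontology: onto

[Term]
id: ONTO:0000001
name: hand

[Term]
id: ONTO:0000002
name: knee
is_a: ONTO:0000001 ! hand
"""

if __name__ == "__main__":
    for stanza in grep_stanzas(SAMPLE, r"id: ONTO:0000001"):
        print(stanza)
```

Searching for `id: ONTO:0000001` returns only the defining stanza, while searching for the bare ID also returns every stanza that refers to it.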
## Before Making Edits
- Read the request carefully and make a plan, especially if there is nuance
- If a PMID is mentioned, try to read it using: `aurelian fulltext PMID:NNNNNN`
  - This also works for DOIs and URLs for scientific papers (if accessible)
- ALWAYS check proposed parent terms for consistency
- Check project-specific guidelines if available
## Editing Workflow
### IMPORTANT: Use Checkout/Checkin for Large Files
- Do not edit large ontology files directly
- Use the checkout/checkin workflow for individual terms
- Check out a term: `obo-checkout.pl src/ontology/{ontology}-edit.obo ONTO:1234567 [OTHER_IDS]`
  - This creates a single stanza file: `terms/{ontology}_1234567.obo` (note: colon replaced with underscore)
- Edit the small file in the `terms/` folder
- Check back in: `obo-checkin.pl src/ontology/{ontology}-edit.obo ONTO:1234567 [OTHER_IDS]`
  - Checking in updates the edit file and removes the file from `terms/`
- You can edit multiple terms in one batch file if needed
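
To make the round trip concrete, here is a minimal Python sketch of what checkout and checkin do conceptually. It is not the code of `obo-checkout.pl` or `obo-checkin.pl`: the function names are hypothetical, the sketch works on strings instead of files, and it assumes the ID prefix keeps its case in the file name.

```python
import re

def _blocks(obo_text):
    """Split OBO text into blank-line-separated blocks."""
    return [b.strip() for b in re.split(r"\n\s*\n", obo_text) if b.strip()]

def checkout_filename(term_id, terms_dir="terms"):
    """Map a term ID to a checkout path, replacing the colon with an underscore."""
    return f"{terms_dir}/{term_id.replace(':', '_')}.obo"

def checkout(edit_text, term_id):
    """Return the single stanza whose id line is exactly `id: <term_id>`."""
    for block in _blocks(edit_text):
        if f"id: {term_id}" in block.splitlines():
            return block + "\n"
    raise KeyError(term_id)

def checkin(edit_text, term_id, new_stanza):
    """Return the edit text with the stanza for `term_id` replaced by `new_stanza`."""
    blocks = _blocks(edit_text)
    for i, block in enumerate(blocks):
        if f"id: {term_id}" in block.splitlines():
            blocks[i] = new_stanza.strip()
            return "\n\n".join(blocks) + "\n"
    raise KeyError(term_id)

# Hypothetical edit file content used only for illustration.
EDIT = """[Term]
id: ONTO:1234567
name: old name

[Term]
id: ONTO:7654321
name: other term
"""

if __name__ == "__main__":
    stanza = checkout(EDIT, "ONTO:1234567")
    edited = stanza.replace("old name", "new name")
    print(checkin(EDIT, "ONTO:1234567", edited))
```

The point of the design is that only the small stanza is ever opened for editing; every other stanza in the large edit file passes through unchanged.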
### Scripts Available
This skill includes three essential scripts:
1. `obo-grep.pl` - Fast searching of OBO files
2. `obo-checkout.pl` - Extract terms to individual files for editing
3. `obo-checkin.pl` - Merge edited terms back into main file
All scripts are available in your PATH when this skill is loaded.
## OBO Format Guidelines
### Basic Structure
- Term ID format: `ONTO:NNNNNNN` (check project conventions for number of digits)
- Each term requires:
  - `id:` - unique identifier
  - `name:` - human-readable label
  - `namespace:` - ontology namespace
  - `def:` - definition with references in square brackets
- Use standard relationship types: `is_a`, `part_of`, `has_part`, etc.
- Follow existing term patterns for consistency
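
A quick way to check a checked-out stanza against the required tags listed above is sketched below. This is an illustrative Python helper with hypothetical names, not part of the skill's scripts, and it only tests for the presence of each tag, not for the quality of its value.

```python
REQUIRED_TAGS = ("id", "name", "namespace", "def")

def missing_tags(stanza):
    """Return the required tags that do not appear in the stanza."""
    present = {
        line.split(":", 1)[0].strip()
        for line in stanza.splitlines()
        if ":" in line
    }
    return [tag for tag in REQUIRED_TAGS if tag not in present]

# Hypothetical stanzas used only for illustration.
COMPLETE = """[Term]
id: ONTO:0000715
name: specific disease
namespace: onto
def: "A general disease that involves specific location." [PMID:12345678]
is_a: ONTO:0001082 ! general disease
"""

INCOMPLETE = """[Term]
id: ONTO:0000716
name: another disease
"""

if __name__ == "__main__":
    print(missing_tags(COMPLETE))
    print(missing_tags(INCOMPLETE))
```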
### Handling New Term Requests (NTRs)
- Check project conventions for temporary ID ranges
  - Example: Some projects use ranges like `ONTO:777xxxx` for new terms
- Always check for ID clashes: `grep 'id: ONTO:777' src/ontology/{ontology}-edit.obo`
- NEVER guess ontology IDs - use search tools to find actual terms
- NEVER guess PMIDs for references - do web searches if needed
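
The clash check above can also be turned into a small helper that proposes the first unused ID in a temporary range. The following Python sketch is illustrative only: the function names are hypothetical, and the range bounds, prefix, and seven-digit width are assumptions that must be replaced with the project's actual conventions.

```python
import re

def used_ids(obo_text, prefix="ONTO"):
    """Collect every ID that appears on an `id:` line of the given text."""
    pattern = rf"^id: ({re.escape(prefix)}:\d+)$"
    return set(re.findall(pattern, obo_text, flags=re.MULTILINE))

def next_free_id(obo_text, start, end, prefix="ONTO", width=7):
    """Return the first ID in [start, end] that is not already defined."""
    taken = used_ids(obo_text, prefix)
    for number in range(start, end + 1):
        candidate = f"{prefix}:{number:0{width}d}"
        if candidate not in taken:
            return candidate
    raise ValueError("temporary ID range is exhausted")

# Hypothetical edit file content used only for illustration.
SAMPLE = """[Term]
id: ONTO:7770000
name: first new term

[Term]
id: ONTO:7770001
name: second new term
is_a: ONTO:7770000 ! first new term
"""

if __name__ == "__main__":
    print(next_free_id(SAMPLE, 7770000, 7779999))
```

Only `id:` lines count as taken, so a reference such as `is_a: ONTO:7770000` does not reserve an ID on its own.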
### Citations and References
- Cite publications appropriately: `def: "..." [PMID:nnnn, doi:mmmm]`
- Fetch full text when needed: `aurelian fulltext PMID:nnnn` (also works with DOIs and URLs)
- All synonyms should include proper citations
- Never use empty brackets `[]` without a source
### Synonyms
Synonyms should include proper attribution:
**Correct:**
```
synonym: "alternative name" EXACT [PMID:12345678]
synonym: "abbrev" EXACT ABBREVIATION [PMID:12345678]
```
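
To catch synonyms with empty brackets before checking a term back in, a check along the following lines may help. This is a minimal Python sketch with hypothetical names; the regular expression covers only the common `synonym: "text" SCOPE [TYPE] [xrefs]` layout and is not a full OBO parser.

```python
import re

# Matches: synonym: "text" SCOPE [optional TYPE] [xref list]
SYNONYM_RE = re.compile(
    r'^synonym: "(?P<text>(?:[^"\\]|\\.)*)" '
    r'(?P<scope>EXACT|BROAD|NARROW|RELATED)'
    r'(?: (?P<type>\S+))? '
    r'\[(?P<xrefs>[^\]]*)\]'
)

def uncited_synonyms(stanza):
    """Return the text of every synonym whose xref list is empty."""
    bad = []
    for line in stanza.splitlines():
        match = SYNONYM_RE.match(line.strip())
        if match and not match.group("xrefs").strip():
            bad.append(match.group("text"))
    return bad

# Hypothetical stanza used only for illustration.
SAMPLE = """[Term]
id: ONTO:0000715
name: specific disease
synonym: "alternative name" EXACT [PMID:12345678]
synonym: "abbrev" EXACT ABBREVIATION [PMID:12345678]
synonym: "unsourced name" RELATED []
"""

if __name__ == "__main__":
    print(uncited_synonyms(SAMPLE))
```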
### Relationships and Logical Definitions
- All terms should have at least one `is_a` parent
- Logical definitions follow genus-differentia form
- Text definitions should mirror logical definitions
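- Include source attribution for relationships when they are based on literature, for example with a trailing qualifier such as `{source="PMID:12345678"}` on the relationship line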
### Logical Definitions (intersection_of)
Example of proper intersection_of usage:
```
[Term]
id: ONTO:0000715
name: specific disease
def: "A general disease that involves specific location." [PMID:12345678]
is_a: ONTO:0001082 ! general disease
intersection_of: ONTO:0001082 ! general disease
intersection_of: disease_has_location UBERON:0000029 ! specific location
```
Note that in OWL this corresponds to: `'specific disease' EquivalentTo 'general disease' and 'disease has location' some 'specific location'`
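
The correspondence between `intersection_of` lines and the OWL expression can be sketched mechanically. The Python below is an illustration under stated assumptions, not a real OBO-to-OWL translation: the function name is hypothetical, labels are taken from the `!` comments, and the relation label is guessed by replacing underscores with spaces (in a real ontology it comes from the relation's own `[Typedef]` stanza).

```python
def manchester_from_stanza(stanza):
    """Render a stanza's intersection_of lines as a Manchester-style string."""
    name = None
    parts = []
    for line in stanza.splitlines():
        tag, _, value = line.partition(": ")
        if tag == "name":
            name = value.strip()
        elif tag == "intersection_of":
            expression, _, label = value.partition(" ! ")
            tokens = expression.split()
            label = label.strip() or tokens[-1]  # fall back to the raw ID
            if len(tokens) == 1:
                parts.append(f"'{label}'")  # genus
            else:
                relation = tokens[0].replace("_", " ")  # assumed relation label
                parts.append(f"'{relation}' some '{label}'")  # differentia
    if len(parts) < 2:
        return None  # a logical definition needs at least two intersection_of lines
    return f"'{name}' EquivalentTo " + " and ".join(parts)

# Stanza modelled on the example above.
SAMPLE = """[Term]
id: ONTO:0000715
name: specific disease
is_a: ONTO:0001082 ! general disease
intersection_of: ONTO:0001082 ! general disease
intersection_of: disease_has_location UBERON:0000029 ! specific location
"""

if __name__ == "__main__":
    print(manchester_from_stanza(SAMPLE))
```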
## Obsoleting Terms
- Obsolete terms should have NO logical axioms (`is_a`, `relationship`, `intersection_of`)
- Obsolete terms may have either one `replaced_by` tag (exact replacement) or multiple `consider` tags (suggested alternatives)
- Always include obsolescence reason and tracker reference
Example of simple obsolescence:
```
[Term]
id: ONTO:0100334
name: obsolete term name
property_value: IAO:0000231 OMO:0001000
property_value: IAO:0000233 "https://github.com/{project}/issues/XXXX" xsd:anyURI
is_obsolete: true
replaced_by: ONTO:0100321
```
Example with considerations instead of replacement:
```
[Term]
id: ONTO:0100229
name: obsolete term name
def: "OBSOLETE. Original definition here." [original references]
property_value: IAO:0000231 OMO:0001000
property_value: IAO:0000233 "https://github.com/{project}/issues/XXXX" xsd:anyURI
is_obsolete: true
consider: ONTO:0100259
consider: ONTO:0100260
```
### Important Notes on Obsolescence
- Synonyms and xrefs can be migrated to replacement terms judiciously
- Never do complete merges with `alt_id` - use obsolescence with replacement instead
- No relationships should point to an obsolete term
- When obsoleting, you may need to rewire other terms to "skip" the obsoleted term
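
The obsolescence rules in this section lend themselves to a simple automated check. The following Python sketch is illustrative only (the function name and messages are hypothetical); it checks a single stanza and does not look for other terms that still point at the obsolete one.

```python
LOGICAL_TAGS = ("is_a", "relationship", "intersection_of")

def obsolete_problems(stanza):
    """Return rule violations for an obsolete stanza (empty list if none)."""
    lines = [line.strip() for line in stanza.splitlines() if line.strip()]
    if "is_obsolete: true" not in lines:
        return []  # the rules below only apply to obsolete terms
    tags = [line.split(":", 1)[0] for line in lines if ":" in line]
    problems = [f"logical axiom present: {tag}" for tag in LOGICAL_TAGS if tag in tags]
    if tags.count("replaced_by") > 1:
        problems.append("more than one replaced_by")
    if not any(l.startswith("property_value: IAO:0000231 ") for l in lines):
        problems.append("missing obsolescence reason (IAO:0000231)")
    if not any(l.startswith("property_value: IAO:0000233 ") for l in lines):
        problems.append("missing tracker reference (IAO:0000233)")
    return problems

# Stanzas modelled on the examples above; the second one breaks two rules.
GOOD = """[Term]
id: ONTO:0100334
name: obsolete term name
property_value: IAO:0000231 OMO:0001000
property_value: IAO:0000233 "https://github.com/{project}/issues/XXXX" xsd:anyURI
is_obsolete: true
replaced_by: ONTO:0100321
"""

BAD = """[Term]
id: ONTO:0100335
name: obsolete other term
property_value: IAO:0000231 OMO:0001000
is_a: ONTO:0100321 ! some parent
is_obsolete: true
"""

if __name__ == "__main__":
    print(obsolete_problems(GOOD))
    print(obsolete_problems(BAD))
```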
## Metadata Best Practices
- Link to issue trackers: `property_value: IAO:0000233 "https://github.com/{project}/issues/XXXX" xsd:anyURI`
- Sign new terms (don't tag pre-existing terms):
```
property_value: http://purl.org/dc/terms/creator https://orcid.org/0000-0001-2345-6789
```
- All terms should have definitions with at least one reference (preferably PMID)
- Dates are typically auto-generated by build processes
## Syntax Checking
Validate OBO syntax using ROBOT:
```bash
robot convert --catalog src/ontology/catalog-v001.xml \
-i src/ontology/{ontology}-edit.obo \
-f obo \
-o {ontology}-edit.TMP.obo
```
If the conversion fails, rerun it with the `-vvv` flag to get a full stack trace.
## Design Patterns
Many OBO ontologies use DOSDP (Dead Simple Ontology Design Patterns):
- Check `src/patterns/dosdp-patterns/*.yaml` for project-specific patterns
- Follow existing patterns when creating similar terms
- Common patterns include:
  - Location-based disease patterns
  - Gene-related disease patterns
  - Part-of hierarchies
  - Abnormality patterns
## Important Reminders
- NEVER guess identifiers of any kind
- If you include an identifier not provided by the user, you MUST verify it
- PMIDs can be checked with `aurelian` or web search
- Always follow project-specific conventions and check existing examples
- When in doubt, ask for clarification rather than making assumptions