Initial commit
This commit is contained in:
99
skills/phylo_from_buscos/README.md
Normal file
99
skills/phylo_from_buscos/README.md
Normal file
@@ -0,0 +1,99 @@
|
||||
# BUSCO-based Phylogenomics Skill
|
||||
|
||||
A Claude Code skills for phylogenomic analyses, created by Bruno de Medeiros (Field Museum) based on code initially written by Paul Frandsen (Brigham Young University)
|
||||
|
||||
It generate a complete phylogenetic workflow from genome assemblies using BUSCO/compleasm-based single-copy orthologs.
|
||||
|
||||
**Features:**
|
||||
- Supports local genome files and NCBI accessions (BioProjects/Assemblies)
|
||||
- Generates scheduler-specific scripts (SLURM, PBS, cloud, local)
|
||||
- Uses modern tools (compleasm, MAFFT, IQ-TREE, ASTRAL)
|
||||
- Multiple alignment trimming options
|
||||
- Both concatenation and coalescent approaches
|
||||
- Quality control with recommendations
|
||||
- Writes a draft methods paragraph describing the pipeline for publications
|
||||
|
||||
**Use when you need to:**
|
||||
- Build phylogenetic trees from multiple genome assemblies
|
||||
- Extract and align single-copy orthologs across genomes
|
||||
- Download genomes from NCBI by accession
|
||||
- Generate ready-to-run scripts for your computing environment
|
||||
|
||||
## Installation
|
||||
See README on the repository root folder for plugin installation.
|
||||
|
||||
|
||||
## Usage
|
||||
|
||||
Once installed, simply describe your phylogenomics task:
|
||||
|
||||
```
|
||||
I need to generate a phylogeny from 20 genome assemblies on a SLURM cluster
|
||||
```
|
||||
|
||||
Claude Code will automatically activate the appropriate skill and guide you through the workflow.
|
||||
|
||||
## Workflow Overview
|
||||
|
||||
The complete phylogenomics pipeline:
|
||||
|
||||
1. **Input Preparation** - Download NCBI genomes if needed
|
||||
2. **Ortholog Identification** - Run compleasm/BUSCO on all genomes
|
||||
3. **Quality Control** - Assess genome completeness with recommendations
|
||||
4. **Ortholog Extraction** - Generate per-locus unaligned FASTA files
|
||||
5. **Alignment** - Align orthologs with MAFFT
|
||||
6. **Trimming** - Remove poorly aligned regions (Aliscore/ALICUT, trimAl, BMGE, ClipKit)
|
||||
7. **Concatenation** - Build supermatrix with partition scheme
|
||||
8. **Phylogenetic Inference** - Generate ML concatenated tree (IQ-TREE), gene trees, and coalescent species tree (ASTRAL)
|
||||
|
||||
## Requirements
|
||||
|
||||
Claude Code is better than the web interface, since Claude will then help you install all requirements.
|
||||
|
||||
The skill generates scripts that install and use:
|
||||
|
||||
- **compleasm** or BUSCO - ortholog detection
|
||||
- **MAFFT** - multiple sequence alignment
|
||||
- **Aliscore/ALICUT, trimAl, BMGE, or ClipKit** - alignment trimming
|
||||
- **FASconCAT** - alignment concatenation
|
||||
- **IQ-TREE** - maximum likelihood phylogenetic inference
|
||||
- **ASTRAL** - coalescent species tree estimation
|
||||
- **NCBI Datasets CLI** - genome download (if using NCBI accessions)
|
||||
|
||||
|
||||
## Computing Environments
|
||||
|
||||
The skill supports multiple computing environments:
|
||||
|
||||
- **SLURM clusters** - generates SBATCH array jobs
|
||||
- **PBS/Torque clusters** - generates PBS array jobs
|
||||
- **Local machines** - sequential execution scripts
|
||||
|
||||
## Attribution
|
||||
|
||||
Created by **Bruno de Medeiros** (Curator of Pollinating Insects, Field Museum) based on phylogenomics tutorials by **Paul Frandsen** (Brigham Young University).
|
||||
|
||||
## Citation
|
||||
|
||||
If you use this skill for published research, please cite this website and also:
|
||||
|
||||
- **compleasm**: Huang, N., & Li, H. (2023). compleasm: a faster and more accurate reimplementation of BUSCO. *Bioinformatics*, 39(10), btad595.
|
||||
- **MAFFT**: Katoh, K., & Standley, D. M. (2013). MAFFT multiple sequence alignment software version 7. *Molecular Biology and Evolution*, 30(4), 772-780.
|
||||
- **IQ-TREE**: Minh, B. Q., et al. (2020). IQ-TREE 2: New models and efficient methods for phylogenetic inference. *Molecular Biology and Evolution*, 37(5), 1530-1534.
|
||||
- **ASTRAL**: Zhang, C., et al. (2018). ASTRAL-III: polynomial time species tree reconstruction. *BMC Bioinformatics*, 19(6), 153.
|
||||
|
||||
Plus any trimming tool you use (Aliscore/ALICUT, trimAl, BMGE, or ClipKit).
|
||||
|
||||
## License
|
||||
|
||||
MIT License - see individual tool licenses for software dependencies.
|
||||
|
||||
## Support
|
||||
|
||||
For issues or questions:
|
||||
- Open an issue in this repository
|
||||
- Contact Bruno de Medeiros at the Field Museum (bdemedeiros@fieldmuseum.org)
|
||||
|
||||
## Acknowledgments
|
||||
|
||||
Special thanks to Paul Frandsen (BYU) for creating the excellent phylogenomics tutorials that form the foundation of this skill.
|
||||
Reference in New Issue
Block a user