# BUSCO-based Phylogenomics Skill
A Claude Code skill for phylogenomic analyses, created by Bruno de Medeiros (Field Museum) based on code initially written by Paul Frandsen (Brigham Young University).
It generates a complete phylogenetic workflow from genome assemblies using BUSCO/compleasm-based single-copy orthologs.
**Features:**
- Supports local genome files and NCBI accessions (BioProjects/Assemblies)
- Generates scheduler-specific scripts (SLURM, PBS, cloud, local)
- Uses modern tools (compleasm, MAFFT, IQ-TREE, ASTRAL)
- Multiple alignment trimming options
- Both concatenation and coalescent approaches
- Quality control with recommendations
- Writes a draft methods paragraph describing the pipeline for publications
**Use when you need to:**
- Build phylogenetic trees from multiple genome assemblies
- Extract and align single-copy orthologs across genomes
- Download genomes from NCBI by accession
- Generate ready-to-run scripts for your computing environment
## Installation
See the README in the repository root folder for plugin installation instructions.
## Usage
Once installed, simply describe your phylogenomics task:
```
I need to generate a phylogeny from 20 genome assemblies on a SLURM cluster
```
Claude Code will automatically activate the appropriate skill and guide you through the workflow.
## Workflow Overview
The complete phylogenomics pipeline:
1. **Input Preparation** - Download NCBI genomes if needed
2. **Ortholog Identification** - Run compleasm/BUSCO on all genomes
3. **Quality Control** - Assess genome completeness with recommendations
4. **Ortholog Extraction** - Generate per-locus unaligned FASTA files
5. **Alignment** - Align orthologs with MAFFT
6. **Trimming** - Remove poorly aligned regions (Aliscore/ALICUT, trimAl, BMGE, ClipKit)
7. **Concatenation** - Build supermatrix with partition scheme
8. **Phylogenetic Inference** - Generate ML concatenated tree (IQ-TREE), gene trees, and coalescent species tree (ASTRAL)
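
As an illustration of step 7, the following is a minimal Python sketch of how per-locus alignments can be joined into a supermatrix with a partition list. It is not the code the skill generates (the skill relies on FASconCAT for this step); the function names `parse_fasta`, `concatenate`, and `partition_lines` are hypothetical, and the RAxML-style partition format (`DNA, name = start-end`) is an assumption about what the downstream tree-inference step would read.

```python
def parse_fasta(text):
    """Parse FASTA text into {taxon: sequence}, joining wrapped lines."""
    records = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Use the first whitespace-delimited token of the header as the taxon name
            name = line[1:].split()[0]
            records[name] = []
        elif name is not None:
            records[name].append(line)
    return {taxon: "".join(parts) for taxon, parts in records.items()}


def concatenate(loci, missing="-"):
    """Build a supermatrix from aligned loci.

    loci maps locus name -> {taxon: aligned sequence}.
    Returns (supermatrix, partitions), where partitions is a list of
    (locus, start, end) tuples with 1-based inclusive coordinates.
    """
    taxa = sorted({taxon for aln in loci.values() for taxon in aln})
    matrix = {taxon: [] for taxon in taxa}
    partitions = []
    start = 1
    for locus, aln in loci.items():
        lengths = {len(seq) for seq in aln.values()}
        if len(lengths) != 1:
            raise ValueError(f"{locus}: sequences are empty or differ in length")
        length = lengths.pop()
        for taxon in taxa:
            # Taxa lacking this locus are padded with the missing-data symbol
            matrix[taxon].append(aln.get(taxon, missing * length))
        partitions.append((locus, start, start + length - 1))
        start += length
    supermatrix = {taxon: "".join(parts) for taxon, parts in matrix.items()}
    return supermatrix, partitions


def partition_lines(partitions, datatype="DNA"):
    """Format partitions as RAxML-style lines (assumed format)."""
    return [f"{datatype}, {locus} = {start}-{end}" for locus, start, end in partitions]
```

Taxa missing from a locus are filled with gap characters, so every row of the supermatrix has the same length and the partition coordinates stay valid for all taxa.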
## Requirements
Claude Code is recommended over the web interface because it can help you install all of the requirements.
The skill generates scripts that install and use:
- **compleasm** or BUSCO - ortholog detection
- **MAFFT** - multiple sequence alignment
- **Aliscore/ALICUT, trimAl, BMGE, or ClipKit** - alignment trimming
- **FASconCAT** - alignment concatenation
- **IQ-TREE** - maximum likelihood phylogenetic inference
- **ASTRAL** - coalescent species tree estimation
- **NCBI Datasets CLI** - genome download (if using NCBI accessions)
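
Before running generated scripts, it can help to confirm which tools are already on your `PATH`. The sketch below is one way to do that in Python; it is not part of the skill. The executable names in `DEFAULT_TOOLS` are assumptions (they vary by installation method and version), and `check_tools` is a hypothetical helper.

```python
import shutil

# Assumed executable names; adjust to match your installation.
# Each role lists candidates in order of preference.
DEFAULT_TOOLS = {
    "ortholog detection": ["compleasm", "busco"],
    "alignment": ["mafft"],
    "tree inference": ["iqtree2", "iqtree"],
    "genome download": ["datasets"],
}


def check_tools(tools=DEFAULT_TOOLS, which=shutil.which):
    """Return {role: first executable found on PATH, or None if none found}."""
    found = {}
    for role, candidates in tools.items():
        found[role] = next((name for name in candidates if which(name)), None)
    return found


if __name__ == "__main__":
    for role, exe in check_tools().items():
        status = exe if exe else "MISSING"
        print(f"{role}: {status}")
```

The `which` parameter exists only so the lookup can be swapped out in tests; by default it uses `shutil.which` from the standard library.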
## Computing Environments
The skill supports multiple computing environments:
- **SLURM clusters** - generates SBATCH array jobs
- **PBS/Torque clusters** - generates PBS array jobs
- **Local machines** - sequential execution scripts
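
To give a sense of what an SBATCH array job looks like, here is a minimal Python sketch that renders one as text. It is an illustration under stated assumptions, not the skill's actual template: `slurm_array_script` is a hypothetical helper, the resource defaults are placeholders, and the input list file name `genomes.txt` (one input path per line) is assumed. The per-task command is supplied by the caller and can refer to the selected input as `$INPUT`.

```python
def slurm_array_script(n_tasks, command, list_file="genomes.txt",
                       job_name="busco_phylo", cpus=8, mem="16G",
                       time="24:00:00"):
    """Render a SLURM array job script that runs `command` once per input.

    Each array task reads the line of `list_file` whose (1-based) number
    equals SLURM_ARRAY_TASK_ID and exposes it to `command` as $INPUT.
    """
    if n_tasks < 1:
        raise ValueError("n_tasks must be at least 1")
    lines = [
        "#!/bin/bash",
        f"#SBATCH --job-name={job_name}",
        f"#SBATCH --array=1-{n_tasks}",
        f"#SBATCH --cpus-per-task={cpus}",
        f"#SBATCH --mem={mem}",
        f"#SBATCH --time={time}",
        "",
        "set -euo pipefail",
        "# Select the input for this array task (line number = task ID)",
        f'INPUT=$(sed -n "${{SLURM_ARRAY_TASK_ID}}p" {list_file})',
        command,
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Placeholder command; replace with the actual per-genome step
    print(slurm_array_script(20, 'echo "Processing $INPUT"'))
```

Keeping the per-task command as a parameter means the same wrapper can serve any per-genome or per-locus step; a PBS version would differ mainly in the directive prefix and the array-index variable.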
## Attribution
Created by **Bruno de Medeiros** (Curator of Pollinating Insects, Field Museum) based on phylogenomics tutorials by **Paul Frandsen** (Brigham Young University).
## Citation
If you use this skill for published research, please cite this repository and also:
- **compleasm**: Huang, N., & Li, H. (2023). compleasm: a faster and more accurate reimplementation of BUSCO. *Bioinformatics*, 39(10), btad595.
- **MAFFT**: Katoh, K., & Standley, D. M. (2013). MAFFT multiple sequence alignment software version 7: improvements in performance and usability. *Molecular Biology and Evolution*, 30(4), 772-780.
- **IQ-TREE**: Minh, B. Q., et al. (2020). IQ-TREE 2: New models and efficient methods for phylogenetic inference in the genomic era. *Molecular Biology and Evolution*, 37(5), 1530-1534.
- **ASTRAL**: Zhang, C., et al. (2018). ASTRAL-III: polynomial time species tree reconstruction from partially resolved gene trees. *BMC Bioinformatics*, 19(Suppl 6), 153.
Plus any trimming tool you use (Aliscore/ALICUT, trimAl, BMGE, or ClipKit).
## License
MIT License - see individual tool licenses for software dependencies.
## Support
For issues or questions:
- Open an issue in this repository
- Contact Bruno de Medeiros at the Field Museum (bdemedeiros@fieldmuseum.org)
## Acknowledgments
Special thanks to Paul Frandsen (BYU) for creating the excellent phylogenomics tutorials that form the foundation of this skill.