BUSCO-based Phylogenomics Skill
A Claude Code skill for phylogenomic analyses, created by Bruno de Medeiros (Field Museum) based on code initially written by Paul Frandsen (Brigham Young University).
It generates a complete phylogenetic workflow from genome assemblies using BUSCO/compleasm-based single-copy orthologs.
Features:
- Supports local genome files and NCBI accessions (BioProjects/Assemblies)
- Generates scheduler-specific scripts (SLURM, PBS, cloud, local)
- Uses modern tools (compleasm, MAFFT, IQ-TREE, ASTRAL)
- Multiple alignment trimming options
- Both concatenation and coalescent approaches
- Quality control with recommendations
- Writes a draft methods paragraph describing the pipeline for publications
Use when you need to:
- Build phylogenetic trees from multiple genome assemblies
- Extract and align single-copy orthologs across genomes
- Download genomes from NCBI by accession
- Generate ready-to-run scripts for your computing environment
Installation
See the README in the repository root folder for plugin installation instructions.
Usage
Once installed, simply describe your phylogenomics task:
I need to generate a phylogeny from 20 genome assemblies on a SLURM cluster
Claude Code will automatically activate the appropriate skill and guide you through the workflow.
Workflow Overview
The complete phylogenomics pipeline:
- Input Preparation - Download NCBI genomes if needed
- Ortholog Identification - Run compleasm/BUSCO on all genomes
- Quality Control - Assess genome completeness with recommendations
- Ortholog Extraction - Generate per-locus unaligned FASTA files
- Alignment - Align orthologs with MAFFT
- Trimming - Remove poorly aligned regions (Aliscore/ALICUT, trimAl, BMGE, ClipKIT)
- Concatenation - Build supermatrix with partition scheme
- Phylogenetic Inference - Generate ML concatenated tree (IQ-TREE), gene trees, and coalescent species tree (ASTRAL)
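To make the concatenation step concrete, here is a minimal Python sketch of how a supermatrix and its partition scheme can be built from per-locus alignments. It is an illustration only: the skill itself uses FASconCAT for this step, and the function names, the in-memory data layout, and the RAxML-style partition lines shown here are assumptions, not the skill's actual output.

```python
from typing import Dict, List, Tuple


def build_supermatrix(
    alignments: Dict[str, Dict[str, str]],
    missing_char: str = "-",
) -> Tuple[Dict[str, str], List[Tuple[str, int, int]]]:
    """Concatenate per-locus alignments into one supermatrix.

    alignments maps locus name -> {taxon name -> aligned sequence}.
    Returns (supermatrix, partitions), where each partition is
    (locus, start, end) in 1-based inclusive coordinates.
    """
    taxa = sorted({taxon for aln in alignments.values() for taxon in aln})
    chunks: Dict[str, List[str]] = {taxon: [] for taxon in taxa}
    partitions: List[Tuple[str, int, int]] = []
    position = 0

    for locus in sorted(alignments):
        aln = alignments[locus]
        lengths = {len(seq) for seq in aln.values()}
        if len(lengths) != 1:
            raise ValueError(f"{locus}: sequences differ in length; is it aligned?")
        length = lengths.pop()
        for taxon in taxa:
            # Taxa lacking this locus are padded with missing-data characters.
            chunks[taxon].append(aln.get(taxon, missing_char * length))
        partitions.append((locus, position + 1, position + length))
        position += length

    supermatrix = {taxon: "".join(parts) for taxon, parts in chunks.items()}
    return supermatrix, partitions


def format_partitions(partitions, datatype="DNA"):
    """Render partitions as RAxML-style lines such as 'DNA, locus1 = 1-4'."""
    return "\n".join(
        f"{datatype}, {name} = {start}-{end}" for name, start, end in partitions
    )


# Toy input: sp_B is missing from locus2 and gets gap padding.
example = {
    "locus1": {"sp_A": "ATGC", "sp_B": "ATGA", "sp_C": "ACGC"},
    "locus2": {"sp_A": "GGA", "sp_C": "GCA"},
}
matrix, parts = build_supermatrix(example)
print(matrix["sp_B"])
print(format_partitions(parts))
```

Padding absent loci with gaps keeps every row the same length, which is what lets a single partition file describe the locus boundaries for all taxa.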
Requirements
Claude Code is recommended over the web interface because Claude can then help you install all requirements.
The skill generates scripts that install and use:
- compleasm or BUSCO - ortholog detection
- MAFFT - multiple sequence alignment
- Aliscore/ALICUT, trimAl, BMGE, or ClipKIT - alignment trimming
- FASconCAT - alignment concatenation
- IQ-TREE - maximum likelihood phylogenetic inference
- ASTRAL - coalescent species tree estimation
- NCBI Datasets CLI - genome download (if using NCBI accessions)
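If you want to check which of these tools are already available before running the generated scripts, a small Python sketch like the one below can help. The executable names are assumptions for illustration (they differ between versions and install methods), and the check covers only a few of the tools listed above.

```python
import shutil
from typing import Dict, List, Optional

# Assumed executable names, not confirmed by this README. Adjust them to
# match your installation (for example, IQ-TREE may be "iqtree" or "iqtree2").
ASSUMED_EXECUTABLES: Dict[str, List[str]] = {
    "ortholog detection": ["compleasm", "busco"],
    "alignment": ["mafft"],
    "ML inference": ["iqtree2", "iqtree"],
}


def find_tools(candidates: Dict[str, List[str]]) -> Dict[str, Optional[str]]:
    """Map each pipeline step to the first candidate found on PATH, or None."""
    found: Dict[str, Optional[str]] = {}
    for step, names in candidates.items():
        found[step] = next((name for name in names if shutil.which(name)), None)
    return found


for step, tool in find_tools(ASSUMED_EXECUTABLES).items():
    status = tool if tool else "not found on PATH"
    print(f"{step}: {status}")
```

A step reported as not found is not necessarily a problem, since the generated scripts install the tools they need.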
Computing Environments
The skill supports multiple computing environments:
- SLURM clusters - generates SBATCH array jobs
- PBS/Torque clusters - generates PBS array jobs
- Local machines - sequential execution scripts
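The difference between an array job and sequential execution can be sketched in Python as follows. This is a simplified illustration, not the skill's actual templates: the function names, the `genomes.txt` list file, the resource settings, and the placeholder `echo` command are all hypothetical. Only the `#SBATCH --array` directive and the `SLURM_ARRAY_TASK_ID` variable are standard SLURM features.

```python
def render_slurm_array(n_tasks, command, list_file="genomes.txt",
                       job_name="phylo_step", cpus=8):
    """Return a SLURM array script that runs `command` once per input line."""
    if n_tasks < 1:
        raise ValueError("n_tasks must be at least 1")
    lines = [
        "#!/bin/bash",
        f"#SBATCH --job-name={job_name}",
        f"#SBATCH --array=1-{n_tasks}",
        f"#SBATCH --cpus-per-task={cpus}",
        "",
        "# Each array task reads the line matching its task ID.",
        f'GENOME=$(sed -n "${{SLURM_ARRAY_TASK_ID}}p" {list_file})',
        command,
    ]
    return "\n".join(lines) + "\n"


def render_local(command, list_file="genomes.txt"):
    """Return a plain bash script that runs `command` for each input line in turn."""
    lines = [
        "#!/bin/bash",
        "while read -r GENOME; do",
        f"    {command}",
        f"done < {list_file}",
    ]
    return "\n".join(lines) + "\n"


# Placeholder command; a real script would call the tool for this step.
step = 'echo "processing $GENOME"'
print(render_slurm_array(20, step))
print(render_local(step))
```

Both scripts expose the same `GENOME` variable to the command, so one per-genome command can be reused across environments; only the dispatch wrapper changes.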
Attribution
Created by Bruno de Medeiros (Curator of Pollinating Insects, Field Museum) based on phylogenomics tutorials by Paul Frandsen (Brigham Young University).
Citation
If you use this skill for published research, please cite this website and also:
- compleasm: Huang, N., & Li, H. (2023). compleasm: a faster and more accurate reimplementation of BUSCO. Bioinformatics, 39(10), btad595.
- MAFFT: Katoh, K., & Standley, D. M. (2013). MAFFT multiple sequence alignment software version 7. Molecular Biology and Evolution, 30(4), 772-780.
- IQ-TREE: Minh, B. Q., et al. (2020). IQ-TREE 2: New models and efficient methods for phylogenetic inference. Molecular Biology and Evolution, 37(5), 1530-1534.
- ASTRAL: Zhang, C., et al. (2018). ASTRAL-III: polynomial time species tree reconstruction from partially resolved gene trees. BMC Bioinformatics, 19(Suppl 6), 153.
Also cite any trimming tool you use (Aliscore/ALICUT, trimAl, BMGE, or ClipKIT).
License
MIT License - see individual tool licenses for software dependencies.
Support
For issues or questions:
- Open an issue in this repository
- Contact Bruno de Medeiros at the Field Museum (bdemedeiros@fieldmuseum.org)
Acknowledgments
Special thanks to Paul Frandsen (BYU) for creating the excellent phylogenomics tutorials that form the foundation of this skill.