Files
2025-11-29 18:02:37 +08:00

4.0 KiB

BUSCO-based Phylogenomics Skill

A Claude Code skills for phylogenomic analyses, created by Bruno de Medeiros (Field Museum) based on code initially written by Paul Frandsen (Brigham Young University)

It generate a complete phylogenetic workflow from genome assemblies using BUSCO/compleasm-based single-copy orthologs.

Features:

  • Supports local genome files and NCBI accessions (BioProjects/Assemblies)
  • Generates scheduler-specific scripts (SLURM, PBS, cloud, local)
  • Uses modern tools (compleasm, MAFFT, IQ-TREE, ASTRAL)
  • Multiple alignment trimming options
  • Both concatenation and coalescent approaches
  • Quality control with recommendations
  • Writes a draft methods paragraph describing the pipeline for publications

Use when you need to:

  • Build phylogenetic trees from multiple genome assemblies
  • Extract and align single-copy orthologs across genomes
  • Download genomes from NCBI by accession
  • Generate ready-to-run scripts for your computing environment

Installation

See README on the repository root folder for plugin installation.

Usage

Once installed, simply describe your phylogenomics task:

I need to generate a phylogeny from 20 genome assemblies on a SLURM cluster

Claude Code will automatically activate the appropriate skill and guide you through the workflow.

Workflow Overview

The complete phylogenomics pipeline:

  1. Input Preparation - Download NCBI genomes if needed
  2. Ortholog Identification - Run compleasm/BUSCO on all genomes
  3. Quality Control - Assess genome completeness with recommendations
  4. Ortholog Extraction - Generate per-locus unaligned FASTA files
  5. Alignment - Align orthologs with MAFFT
  6. Trimming - Remove poorly aligned regions (Aliscore/ALICUT, trimAl, BMGE, ClipKit)
  7. Concatenation - Build supermatrix with partition scheme
  8. Phylogenetic Inference - Generate ML concatenated tree (IQ-TREE), gene trees, and coalescent species tree (ASTRAL)

Requirements

Claude Code is better than the web interface, since Claude will then help you install all requirements.

The skill generates scripts that install and use:

  • compleasm or BUSCO - ortholog detection
  • MAFFT - multiple sequence alignment
  • Aliscore/ALICUT, trimAl, BMGE, or ClipKit - alignment trimming
  • FASconCAT - alignment concatenation
  • IQ-TREE - maximum likelihood phylogenetic inference
  • ASTRAL - coalescent species tree estimation
  • NCBI Datasets CLI - genome download (if using NCBI accessions)

Computing Environments

The skill supports multiple computing environments:

  • SLURM clusters - generates SBATCH array jobs
  • PBS/Torque clusters - generates PBS array jobs
  • Local machines - sequential execution scripts

Attribution

Created by Bruno de Medeiros (Curator of Pollinating Insects, Field Museum) based on phylogenomics tutorials by Paul Frandsen (Brigham Young University).

Citation

If you use this skill for published research, please cite this website and also:

  • compleasm: Huang, N., & Li, H. (2023). compleasm: a faster and more accurate reimplementation of BUSCO. Bioinformatics, 39(10), btad595.
  • MAFFT: Katoh, K., & Standley, D. M. (2013). MAFFT multiple sequence alignment software version 7. Molecular Biology and Evolution, 30(4), 772-780.
  • IQ-TREE: Minh, B. Q., et al. (2020). IQ-TREE 2: New models and efficient methods for phylogenetic inference. Molecular Biology and Evolution, 37(5), 1530-1534.
  • ASTRAL: Zhang, C., et al. (2018). ASTRAL-III: polynomial time species tree reconstruction. BMC Bioinformatics, 19(6), 153.

Plus any trimming tool you use (Aliscore/ALICUT, trimAl, BMGE, or ClipKit).

License

MIT License - see individual tool licenses for software dependencies.

Support

For issues or questions:

Acknowledgments

Special thanks to Paul Frandsen (BYU) for creating the excellent phylogenomics tutorials that form the foundation of this skill.