Pymatgen Transformations and Common Workflows

This reference documents pymatgen's transformation framework and provides recipes for common materials science workflows.

Transformation Framework

Transformations provide a systematic way to modify structures while tracking the history of modifications.

Standard Transformations

Located in pymatgen.transformations.standard_transformations.

SupercellTransformation

Create supercells with arbitrary scaling matrices.

from pymatgen.transformations.standard_transformations import SupercellTransformation

# Simple 2x2x2 supercell
trans = SupercellTransformation([[2,0,0], [0,2,0], [0,0,2]])
new_struct = trans.apply_transformation(struct)

# Non-orthogonal supercell
trans = SupercellTransformation([[2,1,0], [0,2,0], [0,0,2]])
new_struct = trans.apply_transformation(struct)
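The number of sites in a supercell scales with the absolute determinant of the scaling matrix, so both matrices above give eight copies of the original cell. A minimal pure-Python sketch of that arithmetic, independent of pymatgen (the helper name `supercell_multiplicity` is illustrative only):

```python
def supercell_multiplicity(scaling_matrix):
    """Return |det(M)| for a 3x3 integer scaling matrix M.

    The supercell contains this many copies of the original cell.
    """
    (a, b, c), (d, e, f), (g, h, i) = scaling_matrix
    det = a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)
    return abs(det)


diagonal = supercell_multiplicity([[2, 0, 0], [0, 2, 0], [0, 0, 2]])
sheared = supercell_multiplicity([[2, 1, 0], [0, 2, 0], [0, 0, 2]])
print(diagonal, sheared)
```

Checking the determinant before applying a transformation is a cheap way to estimate the cost of the resulting calculation.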

SubstitutionTransformation

Replace species in a structure.

from pymatgen.transformations.standard_transformations import SubstitutionTransformation

# Replace all Fe with Mn
trans = SubstitutionTransformation({"Fe": "Mn"})
new_struct = trans.apply_transformation(struct)

# Partial substitution (50% Fe -> Mn)
trans = SubstitutionTransformation({"Fe": {"Mn": 0.5, "Fe": 0.5}})
new_struct = trans.apply_transformation(struct)

RemoveSpeciesTransformation

Remove specific species from structure.

from pymatgen.transformations.standard_transformations import RemoveSpeciesTransformation

trans = RemoveSpeciesTransformation(["H"])  # Remove all hydrogen
new_struct = trans.apply_transformation(struct)

OrderDisorderedStructureTransformation

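Order disordered structures with partial occupancies. Candidate orderings are ranked by Ewald energy, so the input structure must carry oxidation states.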

from pymatgen.transformations.standard_transformations import OrderDisorderedStructureTransformation

trans = OrderDisorderedStructureTransformation()
new_struct = trans.apply_transformation(disordered_struct)

PrimitiveCellTransformation

Convert to primitive cell.

from pymatgen.transformations.standard_transformations import PrimitiveCellTransformation

trans = PrimitiveCellTransformation()
primitive_struct = trans.apply_transformation(struct)

ConventionalCellTransformation

Convert to conventional cell.

from pymatgen.transformations.standard_transformations import ConventionalCellTransformation

trans = ConventionalCellTransformation()
conventional_struct = trans.apply_transformation(struct)

RotationTransformation

Rotate structure.

from pymatgen.transformations.standard_transformations import RotationTransformation

# Rotate by axis and angle
trans = RotationTransformation([0, 0, 1], 45)  # 45° around z-axis
new_struct = trans.apply_transformation(struct)

ScaleToRelaxedTransformation

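Apply the site and volume relaxation observed between an unrelaxed and a relaxed reference structure to a structurally similar structure.

from pymatgen.transformations.standard_transformations import ScaleToRelaxedTransformation

# The constructor takes the reference pair (unrelaxed, relaxed);
# the relaxation is then applied to a structurally similar structure
trans = ScaleToRelaxedTransformation(unrelaxed_struct, relaxed_struct)
scaled_struct = trans.apply_transformation(similar_struct)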

Advanced Transformations

Located in pymatgen.transformations.advanced_transformations.

EnumerateStructureTransformation

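Enumerate symmetrically distinct ordered structures from a disordered structure. This transformation relies on the external enumlib executables.

from pymatgen.transformations.advanced_transformations import EnumerateStructureTransformation

# max_cell_size is the largest supercell to enumerate, in multiples of the
# input cell (it is not a number of atoms)
trans = EnumerateStructureTransformation(max_cell_size=8)
# Pass an int to return_ranked_list to request up to that many structures
structures = trans.apply_transformation(struct, return_ranked_list=5)

# Returns a ranked list of dicts; the "energy" key may be absent when the
# structure has no oxidation states, so read it with .get()
for s in structures:
    print(f"Energy: {s.get('energy')}, Structure: {s['structure']}")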

MagOrderingTransformation

Enumerate magnetic orderings.

from pymatgen.transformations.advanced_transformations import MagOrderingTransformation

# Specify magnetic moments for each species
trans = MagOrderingTransformation({"Fe": 5.0, "Ni": 2.0})
# Pass an int to return_ranked_list to request up to that many orderings
mag_structures = trans.apply_transformation(struct, return_ranked_list=10)

DopingTransformation

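Dope a structure by substituting a dopant onto compatible sites in a supercell.

from pymatgen.transformations.advanced_transformations import DopingTransformation

# The dopant must carry an oxidation state (e.g. "Mn3+"), and the input
# structure should be decorated with oxidation states. min_length sets the
# minimum supercell edge length (Å), which controls the dopant concentration.
trans = DopingTransformation("Mn3+", min_length=10)
doped_structs = trans.apply_transformation(struct, return_ranked_list=10)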

ChargeBalanceTransformation

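Balance the charge of a structure by partially removing a specified species. The input must carry oxidation states, and the result has fractional occupancy on that species. Although listed in this section, the class is located in pymatgen.transformations.standard_transformations.

from pymatgen.transformations.standard_transformations import ChargeBalanceTransformation

# The species is given with its oxidation state
trans = ChargeBalanceTransformation("Li+")
charged_struct = trans.apply_transformation(struct)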

SlabTransformation

Generate surface slabs.

from pymatgen.transformations.advanced_transformations import SlabTransformation

trans = SlabTransformation(
    miller_index=[1, 0, 0],
    min_slab_size=10,
    min_vacuum_size=10,
    shift=0,
    lll_reduce=True
)
slab = trans.apply_transformation(struct)

Chaining Transformations

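TransformedStructure applies transformations in sequence and records each one in its history.

from pymatgen.alchemy.materials import TransformedStructure
from pymatgen.transformations.standard_transformations import (
    PrimitiveCellTransformation,
    SubstitutionTransformation,
    SupercellTransformation,
)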

# Create transformed structure that tracks history
ts = TransformedStructure(struct, [])

# Apply multiple transformations
ts.append_transformation(SupercellTransformation([[2,0,0],[0,2,0],[0,0,2]]))
ts.append_transformation(SubstitutionTransformation({"Fe": "Mn"}))
ts.append_transformation(PrimitiveCellTransformation())

# Get final structure
final_struct = ts.final_structure

# View transformation history
print(ts.history)

Common Workflows

Workflow 1: High-Throughput Structure Generation

Generate multiple structures for screening studies.

from pymatgen.core import Structure
from pymatgen.transformations.standard_transformations import (
    SubstitutionTransformation,
    SupercellTransformation
)
from pymatgen.io.vasp.sets import MPRelaxSet

# Starting structure
base_struct = Structure.from_file("POSCAR")

# Define substitutions
dopants = ["Mn", "Co", "Ni", "Cu"]
structures = {}

for dopant in dopants:
    # Create substituted structure
    trans = SubstitutionTransformation({"Fe": dopant})
    new_struct = trans.apply_transformation(base_struct)

    # Generate VASP inputs
    vasp_input = MPRelaxSet(new_struct)
    vasp_input.write_input(f"./calcs/Fe_{dopant}")

    structures[dopant] = new_struct

print(f"Generated {len(structures)} structures")

Workflow 2: Phase Diagram Construction

Build and analyze phase diagrams from Materials Project data.

from mp_api.client import MPRester
from pymatgen.analysis.phase_diagram import PhaseDiagram, PDPlotter
from pymatgen.core import Composition

# Get data from Materials Project
with MPRester() as mpr:
    entries = mpr.get_entries_in_chemsys("Li-Fe-O")

# Build phase diagram
pd = PhaseDiagram(entries)

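# Analyze a specific composition: energy above hull for each matching entry
comp = Composition("LiFeO2")
for entry in entries:
    if entry.composition.reduced_formula == comp.reduced_formula:
        e_above_hull = pd.get_e_above_hull(entry)
        print(f"{entry.entry_id}: {e_above_hull:.3f} eV/atom above hull")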

# Get decomposition products
decomp = pd.get_decomposition(comp)
print(f"Decomposition: {decomp}")

# Visualize
plotter = PDPlotter(pd)
plotter.show()
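To make the energy-above-hull idea concrete, here is a minimal pure-Python sketch for a binary system. It is an illustration only, not pymatgen's implementation (which handles arbitrary numbers of components). The function names and formation energies are made up for the example.

```python
def _cross(o, a, b):
    """z-component of (a - o) x (b - o); positive for a counter-clockwise turn."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def lower_hull(points):
    """Lower convex hull of (x, energy) points for a binary A-B system."""
    # Keep only the lowest energy at each composition x
    lowest = {}
    for x, energy in points:
        if x not in lowest or energy < lowest[x]:
            lowest[x] = energy
    hull = []
    for p in sorted(lowest.items()):
        # Drop the previous vertex while it lies on or above the chord to p
        while len(hull) >= 2 and _cross(hull[-2], hull[-1], p) <= 0:
            hull.pop()
        hull.append(p)
    return hull


def energy_above_hull(points, x, energy):
    """Energy of (x, energy) relative to the hull, by linear interpolation."""
    hull = lower_hull(points)
    for (x0, e0), (x1, e1) in zip(hull, hull[1:]):
        if x0 <= x <= x1:
            hull_energy = e0 + (e1 - e0) * (x - x0) / (x1 - x0)
            return energy - hull_energy
    raise ValueError("x lies outside the composition range of the hull")


# Made-up formation energies (eV/atom) versus atomic fraction of B
phases = [(0.0, 0.0), (0.25, -0.3), (0.5, -1.0), (1.0, 0.0)]
print(lower_hull(phases))
print(energy_above_hull(phases, 0.25, -0.3))
```

The phase at x = 0.25 lies above the line joining its neighbours on the hull, so it is predicted to decompose into those two phases.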

Workflow 3: Surface Energy Calculation

Calculate surface energies from slab calculations.

from pymatgen.core import Structure
from pymatgen.core.surface import SlabGenerator
from pymatgen.io.vasp import Vasprun
from pymatgen.io.vasp.sets import MPRelaxSet

# Read bulk structure
bulk = Structure.from_file("bulk_POSCAR")

# Get bulk energy (from previous calculation)
bulk_vasprun = Vasprun("bulk/vasprun.xml")
bulk_energy_per_atom = bulk_vasprun.final_energy / len(bulk)

# Generate slabs
miller_indices = [(1,0,0), (1,1,0), (1,1,1)]
surface_energies = {}

for miller in miller_indices:
    slabgen = SlabGenerator(
        bulk,
        miller_index=miller,
        min_slab_size=10,
        min_vacuum_size=15,
        center_slab=True
    )

    slab = slabgen.get_slabs()[0]

    # Write VASP input for slab
    relax = MPRelaxSet(slab)
    relax.write_input(f"./slab_{miller[0]}{miller[1]}{miller[2]}")

    # After calculation, compute surface energy:
    # slab_vasprun = Vasprun(f"slab_{miller[0]}{miller[1]}{miller[2]}/vasprun.xml")
    # slab_energy = slab_vasprun.final_energy
    # n_atoms = len(slab)
    # area = slab.surface_area  # in Ų
    #
    # # Surface energy (J/m²)
    # surf_energy = (slab_energy - n_atoms * bulk_energy_per_atom) / (2 * area)
    # surf_energy *= 16.021766  # Convert eV/Ų to J/m²
    # surface_energies[miller] = surf_energy

print(f"Set up calculations for {len(miller_indices)} surfaces")
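The commented-out post-processing step can be written as a small standalone function. This is a sketch under stated assumptions: the slab is stoichiometric, both surfaces are equivalent, and the numbers in the usage example are made up. The function name is illustrative.

```python
EV_PER_A2_TO_J_PER_M2 = 16.021766  # 1 eV/Å² expressed in J/m²


def surface_energy(slab_energy, n_atoms, bulk_energy_per_atom, area):
    """Surface energy in J/m² for a slab with two equivalent surfaces.

    Energies are in eV; area is the area of one surface in Å².
    """
    excess = slab_energy - n_atoms * bulk_energy_per_atom
    return excess / (2 * area) * EV_PER_A2_TO_J_PER_M2


# Made-up numbers: 20-atom slab, 10 Å² surface, 2 eV excess energy
gamma = surface_energy(
    slab_energy=-100.0, n_atoms=20, bulk_energy_per_atom=-5.1, area=10.0
)
print(f"{gamma:.3f} J/m²")
```

For asymmetric or non-stoichiometric slabs the simple division by 2 × area no longer applies, and a chemical-potential-dependent treatment is needed.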

Workflow 4: Band Structure Calculation

Complete workflow for band structure calculations.

from pymatgen.core import Structure
from pymatgen.io.vasp.sets import MPRelaxSet, MPStaticSet, MPNonSCFSet
from pymatgen.symmetry.bandstructure import HighSymmKpath

# Step 1: Relaxation
struct = Structure.from_file("initial_POSCAR")
relax = MPRelaxSet(struct)
relax.write_input("./1_relax")

# After relaxation, read structure
relaxed_struct = Structure.from_file("1_relax/CONTCAR")

# Step 2: Static calculation
static = MPStaticSet(relaxed_struct)
static.write_input("./2_static")

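# Step 3: Band structure (non-self-consistent)
# Inspect the high-symmetry path that line mode will follow
kpath = HighSymmKpath(relaxed_struct)
print(kpath.kpath["path"])

# Reuse the structure and charge density from the finished static run
nscf = MPNonSCFSet.from_prev_calc("./2_static", mode="line")
nscf.write_input("./3_bandstructure")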

# After calculations, analyze
from pymatgen.io.vasp import Vasprun
from pymatgen.electronic_structure.plotter import BSPlotter

vasprun = Vasprun("3_bandstructure/vasprun.xml")
bs = vasprun.get_band_structure(line_mode=True)

print(f"Band gap: {bs.get_band_gap()}")

plotter = BSPlotter(bs)
plotter.save_plot("band_structure.png")

Workflow 5: Molecular Dynamics Setup

Set up an ab initio molecular dynamics run.

from pymatgen.core import Structure
from pymatgen.io.vasp.sets import MPMDSet
from pymatgen.transformations.standard_transformations import SupercellTransformation

# Read structure
struct = Structure.from_file("POSCAR")

# Create 2x2x2 supercell for MD
trans = SupercellTransformation([[2,0,0],[0,2,0],[0,0,2]])
supercell = trans.apply_transformation(struct)

# Set up VASP input. MPMDSet already configures an MD run (IBRION = 0 plus
# the step count, time step, and temperatures given below). Further INCAR
# overrides go through user_incar_settings, because the incar attribute of
# an input set is regenerated on access and cannot be assigned to.
md_input = MPMDSet(
    supercell,
    start_temp=300,   # Initial temperature (K)
    end_temp=300,     # Final temperature (K)
    nsteps=1000,      # Number of steps
    time_step=2,      # Time step (fs)
    user_incar_settings={
        "SMASS": 0,   # Nose mass parameter for NVT
        "MDALGO": 2,  # Nose-Hoover thermostat
    },
)
md_input.write_input("./md_calc")

Workflow 6: Diffusion Analysis

Analyze ion diffusion from AIMD trajectories. DiffusionAnalyzer is provided by the separate pymatgen-analysis-diffusion add-on package.

from pymatgen.io.vasp import Xdatcar
from pymatgen.analysis.diffusion.analyzer import DiffusionAnalyzer

# Read trajectory from XDATCAR
xdatcar = Xdatcar("XDATCAR")
structures = xdatcar.structures

# Analyze diffusion for specific species (e.g., Li)
analyzer = DiffusionAnalyzer.from_structures(
    structures,
    specie="Li",
    temperature=300,  # K
    time_step=2,      # fs
    step_skip=10      # Sampling interval: MD steps between stored frames (1 if every step is stored)
)

# Get diffusivity
diffusivity = analyzer.diffusivity  # cm²/s
conductivity = analyzer.conductivity  # mS/cm

# Get mean squared displacement
msd = analyzer.msd

# Plot MSD
analyzer.plot_msd()

print(f"Diffusivity: {diffusivity:.2e} cm²/s")
print(f"Conductivity: {conductivity:.2e} mS/cm")
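The diffusivity reported above rests on the Einstein relation, MSD = 2·d·D·t. The sketch below shows that relation with a plain least-squares fit on synthetic data. It is not DiffusionAnalyzer's algorithm (which offers its own smoothing and fitting options), and the function names are illustrative.

```python
def fit_slope(xs, ys):
    """Least-squares slope of ys versus xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


def diffusivity_cm2_per_s(times_fs, msd_a2, dimensions=3):
    """Einstein relation: MSD = 2 * d * D * t, so D = slope / (2 * d)."""
    slope = fit_slope(times_fs, msd_a2)  # Å²/fs
    return slope / (2 * dimensions) * 0.1  # 1 Å²/fs = 0.1 cm²/s


# Synthetic MSD for D = 1e-4 Å²/fs, i.e. 1e-5 cm²/s
times = [0.0, 1000.0, 2000.0, 3000.0, 4000.0]
msd = [6 * 1e-4 * t for t in times]
d_coeff = diffusivity_cm2_per_s(times, msd)
print(f"{d_coeff:.2e} cm²/s")
```

In practice the early ballistic part of the MSD curve is usually excluded from the fit.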

Workflow 7: Structure Prediction and Enumeration

Predict and enumerate possible structures.

from pymatgen.core import Structure, Lattice
from pymatgen.transformations.advanced_transformations import EnumerateStructureTransformation

# Start with a known structure type (e.g., rocksalt)
lattice = Lattice.cubic(4.2)
struct = Structure.from_spacegroup("Fm-3m", lattice, ["Li", "O"], [[0,0,0], [0.5,0.5,0.5]])

# Create disordered structure: mixed Li/Na occupancy on every Li site
struct.replace_species({"Li": {"Li": 0.5, "Na": 0.5}})

# Enumerate ordered structures (requires enumlib); pass an int to
# return_ranked_list to request up to that many structures
trans = EnumerateStructureTransformation(max_cell_size=4)
ordered_structs = trans.apply_transformation(struct, return_ranked_list=100)

print(f"Found {len(ordered_structs)} distinct ordered structures")

# Write all structures
for i, s_dict in enumerate(ordered_structs[:10]):  # Top 10
    s_dict['structure'].to(filename=f"ordered_struct_{i}.cif")

Workflow 8: Elastic Constant Calculation

Calculate elastic constants using the stress-strain method.

from pymatgen.core import Structure
from pymatgen.transformations.standard_transformations import DeformStructureTransformation
from pymatgen.io.vasp.sets import MPStaticSet

# Read equilibrium structure
struct = Structure.from_file("relaxed_POSCAR")

# Generate deformed structures
strains = [0.00, 0.01, 0.02, -0.01, -0.02]  # Applied strains

for strain in strains:
    # Apply uniaxial strain along the first lattice direction; a full elastic
    # tensor needs additional independent normal and shear deformations
    trans = DeformStructureTransformation([[1+strain, 0, 0], [0, 1, 0], [0, 0, 1]])
    deformed = trans.apply_transformation(struct)

    # Set up VASP calculation
    static = MPStaticSet(deformed)
    static.write_input(f"./strain_{strain:.2f}")

# After calculations, fit stress vs strain to get elastic constants
# from pymatgen.analysis.elasticity.elastic import ElasticTensor
# ... (collect matching lists of strain and stress tensors, e.g. stresses from each OUTCAR)
# elastic_tensor = ElasticTensor.from_pseudoinverse(strain_list, stress_list)
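For the single strain direction used above, the fit reduces to a straight line through the stress component along the strained direction as a function of strain, whose slope is C11. A minimal pure-Python sketch with made-up stresses follows. It assumes the stresses are already in GPa with a consistent sign convention, and the function name is illustrative.

```python
def linear_fit(xs, ys):
    """Return (slope, intercept) of the least-squares line through the data."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


strains = [0.00, 0.01, 0.02, -0.01, -0.02]
# Made-up sigma_xx values (GPa) for a material with C11 = 200 GPa
stresses = [200.0 * strain for strain in strains]

c11, residual_stress = linear_fit(strains, stresses)
print(f"C11 = {c11:.1f} GPa, residual stress = {residual_stress:.3f} GPa")
```

A non-zero intercept indicates residual stress in the reference structure, which suggests the starting geometry was not fully relaxed.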

Workflow 9: Adsorption Energy Calculation

Calculate adsorption energies on surfaces.

from pymatgen.core import Structure, Molecule
from pymatgen.core.surface import SlabGenerator
from pymatgen.analysis.adsorption import AdsorbateSiteFinder
from pymatgen.io.vasp.sets import MPRelaxSet

# Generate slab
bulk = Structure.from_file("bulk_POSCAR")
slabgen = SlabGenerator(bulk, (1,1,1), 10, 10)
slab = slabgen.get_slabs()[0]

# Find adsorption sites
asf = AdsorbateSiteFinder(slab)
ads_sites = asf.find_adsorption_sites()

# Create adsorbate
adsorbate = Molecule("O", [[0, 0, 0]])

# Generate a structure with the adsorbate on the first on-top site
# (add_adsorbate returns a single structure)
ads_struct = asf.add_adsorbate(adsorbate, ads_sites["ontop"][0])

# Set up calculations
relax_slab = MPRelaxSet(slab)
relax_slab.write_input("./slab")

relax_ads = MPRelaxSet(ads_struct)
relax_ads.write_input("./slab_with_adsorbate")

# After calculations:
# E_ads = E(slab+adsorbate) - E(slab) - E(adsorbate_gas)
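The final formula can be captured in a small helper once the three total energies are available. The sketch below uses made-up energies. It takes half the energy of an O2 molecule as the gas-phase reference for atomic oxygen, which is a common convention but an assumption here.

```python
def adsorption_energy(e_slab_with_adsorbate, e_slab, e_adsorbate_gas):
    """E_ads = E(slab+adsorbate) - E(slab) - E(adsorbate_gas), all in eV."""
    return e_slab_with_adsorbate - e_slab - e_adsorbate_gas


# Made-up total energies in eV
e_o2 = -9.8  # isolated O2 molecule
e_ads = adsorption_energy(
    e_slab_with_adsorbate=-205.9,
    e_slab=-200.0,
    e_adsorbate_gas=0.5 * e_o2,  # reference atomic O to 1/2 O2
)
print(f"E_ads = {e_ads:.2f} eV")
```

With this sign convention, a negative value means adsorption is energetically favourable.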

Workflow 10: High-Throughput Materials Screening

Screen materials database for specific properties.

from mp_api.client import MPRester
from pymatgen.core import Structure
import pandas as pd

# Define screening criteria
def screen_material(material):
    """Screen for potential battery cathode materials"""
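    criteria = {
        # Composition supports membership tests by element symbol
        "has_li": "Li" in material.composition,
        "stable": material.energy_above_hull < 0.05,
        # Formation energy is not a voltage; voltage screening needs
        # insertion-electrode data, so only exothermic formation is checked
        "exothermic_formation": material.formation_energy_per_atom < 0,
        "electronically_conductive": material.band_gap < 0.5
    }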
    return all(criteria.values()), criteria

# Query Materials Project
with MPRester() as mpr:
    # Get potential materials
    materials = mpr.materials.summary.search(
        elements=["Li"],
        energy_above_hull=(0, 0.05),
    )

    results = []
    for mat in materials:
        passes, criteria = screen_material(mat)
        if passes:
            results.append({
                "material_id": mat.material_id,
                "formula": mat.formula_pretty,
                "energy_above_hull": mat.energy_above_hull,
                "band_gap": mat.band_gap,
            })

    # Save results
    df = pd.DataFrame(results)
    df.to_csv("screened_materials.csv", index=False)

    print(f"Found {len(results)} promising materials")

Best Practices for Workflows

  1. Modular design: Break workflows into discrete steps
  2. Error handling: Check file existence and calculation convergence
  3. Documentation: Track transformation history using TransformedStructure
  4. Version control: Store input parameters and scripts in git
  5. Automation: Use workflow managers (FireWorks, AiiDA) for complex pipelines
  6. Data management: Organize calculations in clear directory structures
  7. Validation: Always validate intermediate results before proceeding
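As one way to apply the error-handling advice, the sketch below checks that each calculation directory contains the expected output files before any parsing is attempted. The function name and default file list are illustrative assumptions. Convergence still has to be checked separately from the parsed output.

```python
from pathlib import Path


def find_incomplete(calc_dirs, required_files=("vasprun.xml",)):
    """Return (directory, missing_files) pairs for incomplete calculations."""
    incomplete = []
    for calc_dir in calc_dirs:
        missing = [
            name for name in required_files
            if not (Path(calc_dir) / name).is_file()
        ]
        if missing:
            incomplete.append((str(calc_dir), missing))
    return incomplete


# Report which workflow steps still lack output
for calc_dir, missing in find_incomplete(["./1_relax", "./2_static"]):
    print(f"{calc_dir}: missing {', '.join(missing)}")
```

Running such a check at the start of each analysis step makes a pipeline fail early, with a clear message, instead of partway through parsing.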

Integration with Workflow Tools

Pymatgen integrates with several workflow management systems:

  • Atomate: Pre-built VASP workflows
  • FireWorks: Workflow execution engine
  • AiiDA: Provenance tracking and workflow management
  • Custodian: Error correction and job monitoring

These tools provide robust automation for production calculations.