Files
gh-k-dense-ai-claude-scient…/skills/pymatgen/references/transformations_workflows.md
2025-11-30 08:30:10 +08:00

592 lines
16 KiB
Markdown

# Pymatgen Transformations and Common Workflows
This reference documents pymatgen's transformation framework and provides recipes for common materials science workflows.
## Transformation Framework
Transformations provide a systematic way to modify structures while tracking the history of modifications.
### Standard Transformations
Located in `pymatgen.transformations.standard_transformations`.
#### SupercellTransformation
Create supercells with arbitrary scaling matrices.
```python
from pymatgen.transformations.standard_transformations import SupercellTransformation
# Simple 2x2x2 supercell
trans = SupercellTransformation([[2,0,0], [0,2,0], [0,0,2]])
new_struct = trans.apply_transformation(struct)
# Non-orthogonal supercell
trans = SupercellTransformation([[2,1,0], [0,2,0], [0,0,2]])
new_struct = trans.apply_transformation(struct)
```
#### SubstitutionTransformation
Replace species in a structure.
```python
from pymatgen.transformations.standard_transformations import SubstitutionTransformation
# Replace all Fe with Mn
trans = SubstitutionTransformation({"Fe": "Mn"})
new_struct = trans.apply_transformation(struct)
# Partial substitution (50% Fe -> Mn)
trans = SubstitutionTransformation({"Fe": {"Mn": 0.5, "Fe": 0.5}})
new_struct = trans.apply_transformation(struct)
```
#### RemoveSpeciesTransformation
Remove specific species from structure.
```python
from pymatgen.transformations.standard_transformations import RemoveSpeciesTransformation
trans = RemoveSpeciesTransformation(["H"]) # Remove all hydrogen
new_struct = trans.apply_transformation(struct)
```
#### OrderDisorderedStructureTransformation
Order disordered structures with partial occupancies.
```python
from pymatgen.transformations.standard_transformations import OrderDisorderedStructureTransformation
trans = OrderDisorderedStructureTransformation()
new_struct = trans.apply_transformation(disordered_struct)
```
#### PrimitiveCellTransformation
Convert to primitive cell.
```python
from pymatgen.transformations.standard_transformations import PrimitiveCellTransformation
trans = PrimitiveCellTransformation()
primitive_struct = trans.apply_transformation(struct)
```
#### ConventionalCellTransformation
Convert to conventional cell.
```python
from pymatgen.transformations.standard_transformations import ConventionalCellTransformation
trans = ConventionalCellTransformation()
conventional_struct = trans.apply_transformation(struct)
```
#### RotationTransformation
Rotate structure.
```python
from pymatgen.transformations.standard_transformations import RotationTransformation
# Rotate by axis and angle
trans = RotationTransformation([0, 0, 1], 45) # 45° around z-axis
new_struct = trans.apply_transformation(struct)
```
#### ScaleToRelaxedTransformation
Scale lattice to match a relaxed structure.
```python
from pymatgen.transformations.standard_transformations import ScaleToRelaxedTransformation
trans = ScaleToRelaxedTransformation(relaxed_struct)
scaled_struct = trans.apply_transformation(unrelaxed_struct)
```
### Advanced Transformations
Located in `pymatgen.transformations.advanced_transformations`.
#### EnumerateStructureTransformation
Enumerate all symmetrically distinct ordered structures from a disordered structure.
```python
from pymatgen.transformations.advanced_transformations import EnumerateStructureTransformation
# Enumerate structures up to max 8 atoms per unit cell
trans = EnumerateStructureTransformation(max_cell_size=8)
structures = trans.apply_transformation(struct, return_ranked_list=True)
# Returns list of ranked structures
for s in structures[:5]: # Top 5 structures
print(f"Energy: {s['energy']}, Structure: {s['structure']}")
```
#### MagOrderingTransformation
Enumerate magnetic orderings.
```python
from pymatgen.transformations.advanced_transformations import MagOrderingTransformation
# Specify magnetic moments for each species
trans = MagOrderingTransformation({"Fe": 5.0, "Ni": 2.0})
mag_structures = trans.apply_transformation(struct, return_ranked_list=True)
```
#### DopingTransformation
Systematically dope a structure.
```python
from pymatgen.transformations.advanced_transformations import DopingTransformation
# Replace 12.5% of Fe sites with Mn
trans = DopingTransformation("Mn", min_length=10)
doped_structs = trans.apply_transformation(struct, return_ranked_list=True)
```
#### ChargeBalanceTransformation
Balance charge in a structure by oxidation state manipulation.
```python
from pymatgen.transformations.advanced_transformations import ChargeBalanceTransformation
trans = ChargeBalanceTransformation("Li")
charged_struct = trans.apply_transformation(struct)
```
#### SlabTransformation
Generate surface slabs.
```python
from pymatgen.transformations.advanced_transformations import SlabTransformation
trans = SlabTransformation(
miller_index=[1, 0, 0],
min_slab_size=10,
min_vacuum_size=10,
shift=0,
lll_reduce=True
)
slab = trans.apply_transformation(struct)
```
### Chaining Transformations
```python
from pymatgen.alchemy.materials import TransformedStructure
# Create transformed structure that tracks history
ts = TransformedStructure(struct, [])
# Apply multiple transformations
ts.append_transformation(SupercellTransformation([[2,0,0],[0,2,0],[0,0,2]]))
ts.append_transformation(SubstitutionTransformation({"Fe": "Mn"}))
ts.append_transformation(PrimitiveCellTransformation())
# Get final structure
final_struct = ts.final_structure
# View transformation history
print(ts.history)
```
## Common Workflows
### Workflow 1: High-Throughput Structure Generation
Generate multiple structures for screening studies.
```python
from pymatgen.core import Structure
from pymatgen.transformations.standard_transformations import (
SubstitutionTransformation,
SupercellTransformation
)
from pymatgen.io.vasp.sets import MPRelaxSet
# Starting structure
base_struct = Structure.from_file("POSCAR")
# Define substitutions
dopants = ["Mn", "Co", "Ni", "Cu"]
structures = {}
for dopant in dopants:
# Create substituted structure
trans = SubstitutionTransformation({"Fe": dopant})
new_struct = trans.apply_transformation(base_struct)
# Generate VASP inputs
vasp_input = MPRelaxSet(new_struct)
vasp_input.write_input(f"./calcs/Fe_{dopant}")
structures[dopant] = new_struct
print(f"Generated {len(structures)} structures")
```
### Workflow 2: Phase Diagram Construction
Build and analyze phase diagrams from Materials Project data.
```python
from mp_api.client import MPRester
from pymatgen.analysis.phase_diagram import PhaseDiagram, PDPlotter
from pymatgen.core import Composition
# Get data from Materials Project
with MPRester() as mpr:
entries = mpr.get_entries_in_chemsys("Li-Fe-O")
# Build phase diagram
pd = PhaseDiagram(entries)
# Analyze specific composition
comp = Composition("LiFeO2")
e_above_hull = pd.get_e_above_hull(entries[0])
# Get decomposition products
decomp = pd.get_decomposition(comp)
print(f"Decomposition: {decomp}")
# Visualize
plotter = PDPlotter(pd)
plotter.show()
```
### Workflow 3: Surface Energy Calculation
Calculate surface energies from slab calculations.
```python
from pymatgen.core.surface import SlabGenerator, generate_all_slabs
from pymatgen.io.vasp.sets import MPStaticSet, MPRelaxSet
from pymatgen.core import Structure
# Read bulk structure
bulk = Structure.from_file("bulk_POSCAR")
# Get bulk energy (from previous calculation)
from pymatgen.io.vasp import Vasprun
bulk_vasprun = Vasprun("bulk/vasprun.xml")
bulk_energy_per_atom = bulk_vasprun.final_energy / len(bulk)
# Generate slabs
miller_indices = [(1,0,0), (1,1,0), (1,1,1)]
surface_energies = {}
for miller in miller_indices:
slabgen = SlabGenerator(
bulk,
miller_index=miller,
min_slab_size=10,
min_vacuum_size=15,
center_slab=True
)
slab = slabgen.get_slabs()[0]
# Write VASP input for slab
relax = MPRelaxSet(slab)
relax.write_input(f"./slab_{miller[0]}{miller[1]}{miller[2]}")
# After calculation, compute surface energy:
# slab_vasprun = Vasprun(f"slab_{miller[0]}{miller[1]}{miller[2]}/vasprun.xml")
# slab_energy = slab_vasprun.final_energy
# n_atoms = len(slab)
# area = slab.surface_area # in Ų
#
# # Surface energy (J/m²)
# surf_energy = (slab_energy - n_atoms * bulk_energy_per_atom) / (2 * area)
# surf_energy *= 16.021766 # Convert eV/Ų to J/m²
# surface_energies[miller] = surf_energy
print(f"Set up calculations for {len(miller_indices)} surfaces")
```
### Workflow 4: Band Structure Calculation
Complete workflow for band structure calculations.
```python
from pymatgen.core import Structure
from pymatgen.io.vasp.sets import MPRelaxSet, MPStaticSet, MPNonSCFSet
from pymatgen.symmetry.bandstructure import HighSymmKpath
# Step 1: Relaxation
struct = Structure.from_file("initial_POSCAR")
relax = MPRelaxSet(struct)
relax.write_input("./1_relax")
# After relaxation, read structure
relaxed_struct = Structure.from_file("1_relax/CONTCAR")
# Step 2: Static calculation
static = MPStaticSet(relaxed_struct)
static.write_input("./2_static")
# Step 3: Band structure (non-self-consistent)
kpath = HighSymmKpath(relaxed_struct)
nscf = MPNonSCFSet(relaxed_struct, mode="line") # Band structure mode
nscf.write_input("./3_bandstructure")
# After calculations, analyze
from pymatgen.io.vasp import Vasprun
from pymatgen.electronic_structure.plotter import BSPlotter
vasprun = Vasprun("3_bandstructure/vasprun.xml")
bs = vasprun.get_band_structure(line_mode=True)
print(f"Band gap: {bs.get_band_gap()}")
plotter = BSPlotter(bs)
plotter.save_plot("band_structure.png")
```
### Workflow 5: Molecular Dynamics Setup
Set up and analyze molecular dynamics simulations.
```python
from pymatgen.core import Structure
from pymatgen.io.vasp.sets import MVLRelaxSet
from pymatgen.io.vasp.inputs import Incar
# Read structure
struct = Structure.from_file("POSCAR")
# Create 2x2x2 supercell for MD
from pymatgen.transformations.standard_transformations import SupercellTransformation
trans = SupercellTransformation([[2,0,0],[0,2,0],[0,0,2]])
supercell = trans.apply_transformation(struct)
# Set up VASP input
md_input = MVLRelaxSet(supercell)
# Modify INCAR for MD
incar = md_input.incar
incar.update({
"IBRION": 0, # Molecular dynamics
"NSW": 1000, # Number of steps
"POTIM": 2, # Time step (fs)
"TEBEG": 300, # Initial temperature (K)
"TEEND": 300, # Final temperature (K)
"SMASS": 0, # NVT ensemble
"MDALGO": 2, # Nose-Hoover thermostat
})
md_input.incar = incar
md_input.write_input("./md_calc")
```
### Workflow 6: Diffusion Analysis
Analyze ion diffusion from AIMD trajectories.
```python
from pymatgen.io.vasp import Xdatcar
from pymatgen.analysis.diffusion.analyzer import DiffusionAnalyzer
# Read trajectory from XDATCAR
xdatcar = Xdatcar("XDATCAR")
structures = xdatcar.structures
# Analyze diffusion for specific species (e.g., Li)
analyzer = DiffusionAnalyzer.from_structures(
structures,
specie="Li",
temperature=300, # K
time_step=2, # fs
step_skip=10 # Skip initial equilibration
)
# Get diffusivity
diffusivity = analyzer.diffusivity # cm²/s
conductivity = analyzer.conductivity # mS/cm
# Get mean squared displacement
msd = analyzer.msd
# Plot MSD
analyzer.plot_msd()
print(f"Diffusivity: {diffusivity:.2e} cm²/s")
print(f"Conductivity: {conductivity:.2e} mS/cm")
```
### Workflow 7: Structure Prediction and Enumeration
Predict and enumerate possible structures.
```python
from pymatgen.core import Structure, Lattice
from pymatgen.transformations.advanced_transformations import (
EnumerateStructureTransformation,
SubstitutionTransformation
)
# Start with a known structure type (e.g., rocksalt)
lattice = Lattice.cubic(4.2)
struct = Structure.from_spacegroup("Fm-3m", lattice, ["Li", "O"], [[0,0,0], [0.5,0.5,0.5]])
# Create disordered structure
from pymatgen.core import Species
species_on_site = {Species("Li"): 0.5, Species("Na"): 0.5}
struct[0] = species_on_site # Mixed occupancy on Li site
# Enumerate all ordered structures
trans = EnumerateStructureTransformation(max_cell_size=4)
ordered_structs = trans.apply_transformation(struct, return_ranked_list=True)
print(f"Found {len(ordered_structs)} distinct ordered structures")
# Write all structures
for i, s_dict in enumerate(ordered_structs[:10]): # Top 10
s_dict['structure'].to(filename=f"ordered_struct_{i}.cif")
```
### Workflow 8: Elastic Constant Calculation
Calculate elastic constants using the stress-strain method.
```python
from pymatgen.core import Structure
from pymatgen.transformations.standard_transformations import DeformStructureTransformation
from pymatgen.io.vasp.sets import MPStaticSet
# Read equilibrium structure
struct = Structure.from_file("relaxed_POSCAR")
# Generate deformed structures
strains = [0.00, 0.01, 0.02, -0.01, -0.02] # Applied strains
deformation_sets = []
for strain in strains:
# Apply strain in different directions
trans = DeformStructureTransformation([[1+strain, 0, 0], [0, 1, 0], [0, 0, 1]])
deformed = trans.apply_transformation(struct)
# Set up VASP calculation
static = MPStaticSet(deformed)
static.write_input(f"./strain_{strain:.2f}")
# After calculations, fit stress vs strain to get elastic constants
# from pymatgen.analysis.elasticity import ElasticTensor
# ... (collect stress tensors from OUTCAR)
# elastic_tensor = ElasticTensor.from_stress_list(stress_list)
```
### Workflow 9: Adsorption Energy Calculation
Calculate adsorption energies on surfaces.
```python
from pymatgen.core import Structure, Molecule
from pymatgen.core.surface import SlabGenerator
from pymatgen.analysis.adsorption import AdsorbateSiteFinder
from pymatgen.io.vasp.sets import MPRelaxSet
# Generate slab
bulk = Structure.from_file("bulk_POSCAR")
slabgen = SlabGenerator(bulk, (1,1,1), 10, 10)
slab = slabgen.get_slabs()[0]
# Find adsorption sites
asf = AdsorbateSiteFinder(slab)
ads_sites = asf.find_adsorption_sites()
# Create adsorbate
adsorbate = Molecule("O", [[0, 0, 0]])
# Generate structures with adsorbate
ads_structs = asf.add_adsorbate(adsorbate, ads_sites["ontop"][0])
# Set up calculations
relax_slab = MPRelaxSet(slab)
relax_slab.write_input("./slab")
relax_ads = MPRelaxSet(ads_structs)
relax_ads.write_input("./slab_with_adsorbate")
# After calculations:
# E_ads = E(slab+adsorbate) - E(slab) - E(adsorbate_gas)
```
### Workflow 10: High-Throughput Materials Screening
Screen materials database for specific properties.
```python
from mp_api.client import MPRester
from pymatgen.core import Structure
import pandas as pd
# Define screening criteria
def screen_material(material):
"""Screen for potential battery cathode materials"""
criteria = {
"has_li": "Li" in material.composition.elements,
"stable": material.energy_above_hull < 0.05,
"good_voltage": 2.5 < material.formation_energy_per_atom < 4.5,
"electronically_conductive": material.band_gap < 0.5
}
return all(criteria.values()), criteria
# Query Materials Project
with MPRester() as mpr:
# Get potential materials
materials = mpr.materials.summary.search(
elements=["Li"],
energy_above_hull=(0, 0.05),
)
results = []
for mat in materials:
passes, criteria = screen_material(mat)
if passes:
results.append({
"material_id": mat.material_id,
"formula": mat.formula_pretty,
"energy_above_hull": mat.energy_above_hull,
"band_gap": mat.band_gap,
})
# Save results
df = pd.DataFrame(results)
df.to_csv("screened_materials.csv", index=False)
print(f"Found {len(results)} promising materials")
```
## Best Practices for Workflows
1. **Modular design**: Break workflows into discrete steps
2. **Error handling**: Check file existence and calculation convergence
3. **Documentation**: Track transformation history using `TransformedStructure`
4. **Version control**: Store input parameters and scripts in git
5. **Automation**: Use workflow managers (Fireworks, AiiDA) for complex pipelines
6. **Data management**: Organize calculations in clear directory structures
7. **Validation**: Always validate intermediate results before proceeding
## Integration with Workflow Tools
Pymatgen integrates with several workflow management systems:
- **Atomate**: Pre-built VASP workflows
- **Fireworks**: Workflow execution engine
- **AiiDA**: Provenance tracking and workflow management
- **Custodian**: Error correction and job monitoring
These tools provide robust automation for production calculations.