Initial commit
This commit is contained in:
457
skills/cobrapy/SKILL.md
Normal file
457
skills/cobrapy/SKILL.md
Normal file
@@ -0,0 +1,457 @@
|
||||
---
|
||||
name: cobrapy
|
||||
description: "Constraint-based metabolic modeling (COBRA). FBA, FVA, gene knockouts, flux sampling, SBML models, for systems biology and metabolic engineering analysis."
|
||||
---
|
||||
|
||||
# COBRApy - Constraint-Based Reconstruction and Analysis
|
||||
|
||||
## Overview
|
||||
|
||||
COBRApy is a Python library for constraint-based reconstruction and analysis (COBRA) of metabolic models, essential for systems biology research. Work with genome-scale metabolic models, perform computational simulations of cellular metabolism, conduct metabolic engineering analyses, and predict phenotypic behaviors.
|
||||
|
||||
## Core Capabilities
|
||||
|
||||
COBRApy provides comprehensive tools organized into several key areas:
|
||||
|
||||
### 1. Model Management
|
||||
|
||||
Load existing models from repositories or files:
|
||||
```python
|
||||
from cobra.io import load_model
|
||||
|
||||
# Load bundled test models
|
||||
model = load_model("textbook") # E. coli core model
|
||||
model = load_model("ecoli") # Full E. coli model
|
||||
model = load_model("salmonella")
|
||||
|
||||
# Load from files
|
||||
from cobra.io import read_sbml_model, load_json_model, load_yaml_model
|
||||
model = read_sbml_model("path/to/model.xml")
|
||||
model = load_json_model("path/to/model.json")
|
||||
model = load_yaml_model("path/to/model.yml")
|
||||
```
|
||||
|
||||
Save models in various formats:
|
||||
```python
|
||||
from cobra.io import write_sbml_model, save_json_model, save_yaml_model
|
||||
write_sbml_model(model, "output.xml") # Preferred format
|
||||
save_json_model(model, "output.json") # For Escher compatibility
|
||||
save_yaml_model(model, "output.yml") # Human-readable
|
||||
```
|
||||
|
||||
### 2. Model Structure and Components
|
||||
|
||||
Access and inspect model components:
|
||||
```python
|
||||
# Access components
|
||||
model.reactions # DictList of all reactions
|
||||
model.metabolites # DictList of all metabolites
|
||||
model.genes # DictList of all genes
|
||||
|
||||
# Get specific items by ID or index
|
||||
reaction = model.reactions.get_by_id("PFK")
|
||||
metabolite = model.metabolites[0]
|
||||
|
||||
# Inspect properties
|
||||
print(reaction.reaction) # Stoichiometric equation
|
||||
print(reaction.bounds) # Flux constraints
|
||||
print(reaction.gene_reaction_rule) # GPR logic
|
||||
print(metabolite.formula) # Chemical formula
|
||||
print(metabolite.compartment) # Cellular location
|
||||
```
|
||||
|
||||
### 3. Flux Balance Analysis (FBA)
|
||||
|
||||
Perform standard FBA simulation:
|
||||
```python
|
||||
# Basic optimization
|
||||
solution = model.optimize()
|
||||
print(f"Objective value: {solution.objective_value}")
|
||||
print(f"Status: {solution.status}")
|
||||
|
||||
# Access fluxes
|
||||
print(solution.fluxes["PFK"])
|
||||
print(solution.fluxes.head())
|
||||
|
||||
# Fast optimization (objective value only)
|
||||
objective_value = model.slim_optimize()
|
||||
|
||||
# Change objective
|
||||
model.objective = "ATPM"
|
||||
solution = model.optimize()
|
||||
```
|
||||
|
||||
Parsimonious FBA (minimize total flux):
|
||||
```python
|
||||
from cobra.flux_analysis import pfba
|
||||
solution = pfba(model)
|
||||
```
|
||||
|
||||
Geometric FBA (find central solution):
|
||||
```python
|
||||
from cobra.flux_analysis import geometric_fba
|
||||
solution = geometric_fba(model)
|
||||
```
|
||||
|
||||
### 4. Flux Variability Analysis (FVA)
|
||||
|
||||
Determine flux ranges for all reactions:
|
||||
```python
|
||||
from cobra.flux_analysis import flux_variability_analysis
|
||||
|
||||
# Standard FVA
|
||||
fva_result = flux_variability_analysis(model)
|
||||
|
||||
# FVA at 90% optimality
|
||||
fva_result = flux_variability_analysis(model, fraction_of_optimum=0.9)
|
||||
|
||||
# Loopless FVA (eliminates thermodynamically infeasible loops)
|
||||
fva_result = flux_variability_analysis(model, loopless=True)
|
||||
|
||||
# FVA for specific reactions
|
||||
fva_result = flux_variability_analysis(
|
||||
model,
|
||||
reaction_list=["PFK", "FBA", "PGI"]
|
||||
)
|
||||
```
|
||||
|
||||
### 5. Gene and Reaction Deletion Studies
|
||||
|
||||
Perform knockout analyses:
|
||||
```python
|
||||
from cobra.flux_analysis import (
|
||||
single_gene_deletion,
|
||||
single_reaction_deletion,
|
||||
double_gene_deletion,
|
||||
double_reaction_deletion
|
||||
)
|
||||
|
||||
# Single deletions
|
||||
gene_results = single_gene_deletion(model)
|
||||
reaction_results = single_reaction_deletion(model)
|
||||
|
||||
# Double deletions (uses multiprocessing)
|
||||
double_gene_results = double_gene_deletion(
|
||||
model,
|
||||
processes=4 # Number of CPU cores
|
||||
)
|
||||
|
||||
# Manual knockout using context manager
|
||||
with model:
|
||||
model.genes.get_by_id("b0008").knock_out()
|
||||
solution = model.optimize()
|
||||
print(f"Growth after knockout: {solution.objective_value}")
|
||||
# Model automatically reverts after context exit
|
||||
```
|
||||
|
||||
### 6. Growth Media and Minimal Media
|
||||
|
||||
Manage growth medium:
|
||||
```python
|
||||
# View current medium
|
||||
print(model.medium)
|
||||
|
||||
# Modify medium (must reassign entire dict)
|
||||
medium = model.medium
|
||||
medium["EX_glc__D_e"] = 10.0 # Set glucose uptake
|
||||
medium["EX_o2_e"] = 0.0 # Anaerobic conditions
|
||||
model.medium = medium
|
||||
|
||||
# Calculate minimal media
|
||||
from cobra.medium import minimal_medium
|
||||
|
||||
# Minimize total import flux
|
||||
min_medium = minimal_medium(model, minimize_components=False)
|
||||
|
||||
# Minimize number of components (uses MILP, slower)
|
||||
min_medium = minimal_medium(
|
||||
model,
|
||||
minimize_components=True,
|
||||
open_exchanges=True
|
||||
)
|
||||
```
|
||||
|
||||
### 7. Flux Sampling
|
||||
|
||||
Sample the feasible flux space:
|
||||
```python
|
||||
from cobra.sampling import sample
|
||||
|
||||
# Sample using OptGP (default, supports parallel processing)
|
||||
samples = sample(model, n=1000, method="optgp", processes=4)
|
||||
|
||||
# Sample using ACHR
|
||||
samples = sample(model, n=1000, method="achr")
|
||||
|
||||
# Validate samples
|
||||
from cobra.sampling import OptGPSampler
|
||||
sampler = OptGPSampler(model, processes=4)
|
||||
sampler.sample(1000)
|
||||
validation = sampler.validate(sampler.samples)
|
||||
print(validation.value_counts()) # Should be all 'v' for valid
|
||||
```
|
||||
|
||||
### 8. Production Envelopes
|
||||
|
||||
Calculate phenotype phase planes:
|
||||
```python
|
||||
from cobra.flux_analysis import production_envelope
|
||||
|
||||
# Standard production envelope
|
||||
envelope = production_envelope(
|
||||
model,
|
||||
reactions=["EX_glc__D_e", "EX_o2_e"],
|
||||
objective="EX_ac_e" # Acetate production
|
||||
)
|
||||
|
||||
# With carbon yield
|
||||
envelope = production_envelope(
|
||||
model,
|
||||
reactions=["EX_glc__D_e", "EX_o2_e"],
|
||||
carbon_sources="EX_glc__D_e"
|
||||
)
|
||||
|
||||
# Visualize (use matplotlib or pandas plotting)
|
||||
import matplotlib.pyplot as plt
|
||||
envelope.plot(x="EX_glc__D_e", y="EX_o2_e", kind="scatter")
|
||||
plt.show()
|
||||
```
|
||||
|
||||
### 9. Gapfilling
|
||||
|
||||
Add reactions to make models feasible:
|
||||
```python
|
||||
from cobra.flux_analysis import gapfill
|
||||
|
||||
# Prepare universal model with candidate reactions
|
||||
universal = load_model("universal")
|
||||
|
||||
# Perform gapfilling
|
||||
with model:
|
||||
# Remove reactions to create gaps for demonstration
|
||||
model.remove_reactions([model.reactions.PGI])
|
||||
|
||||
# Find reactions needed
|
||||
solution = gapfill(model, universal)
|
||||
print(f"Reactions to add: {solution}")
|
||||
```
|
||||
|
||||
### 10. Model Building
|
||||
|
||||
Build models from scratch:
|
||||
```python
|
||||
from cobra import Model, Reaction, Metabolite
|
||||
|
||||
# Create model
|
||||
model = Model("my_model")
|
||||
|
||||
# Create metabolites
|
||||
atp_c = Metabolite("atp_c", formula="C10H12N5O13P3",
|
||||
name="ATP", compartment="c")
|
||||
adp_c = Metabolite("adp_c", formula="C10H12N5O10P2",
|
||||
name="ADP", compartment="c")
|
||||
pi_c = Metabolite("pi_c", formula="HO4P",
|
||||
name="Phosphate", compartment="c")
|
||||
|
||||
# Create reaction
|
||||
reaction = Reaction("ATPASE")
|
||||
reaction.name = "ATP hydrolysis"
|
||||
reaction.subsystem = "Energy"
|
||||
reaction.lower_bound = 0.0
|
||||
reaction.upper_bound = 1000.0
|
||||
|
||||
# Add metabolites with stoichiometry
|
||||
reaction.add_metabolites({
|
||||
atp_c: -1.0,
|
||||
adp_c: 1.0,
|
||||
pi_c: 1.0
|
||||
})
|
||||
|
||||
# Add gene-reaction rule
|
||||
reaction.gene_reaction_rule = "(gene1 and gene2) or gene3"
|
||||
|
||||
# Add to model
|
||||
model.add_reactions([reaction])
|
||||
|
||||
# Add boundary reactions
|
||||
model.add_boundary(atp_c, type="exchange")
|
||||
model.add_boundary(adp_c, type="demand")
|
||||
|
||||
# Set objective
|
||||
model.objective = "ATPASE"
|
||||
```
|
||||
|
||||
## Common Workflows
|
||||
|
||||
### Workflow 1: Load Model and Predict Growth
|
||||
|
||||
```python
|
||||
from cobra.io import load_model
|
||||
|
||||
# Load model
|
||||
model = load_model("ecoli")
|
||||
|
||||
# Run FBA
|
||||
solution = model.optimize()
|
||||
print(f"Growth rate: {solution.objective_value:.3f} /h")
|
||||
|
||||
# Show active pathways
|
||||
print(solution.fluxes[solution.fluxes.abs() > 1e-6])
|
||||
```
|
||||
|
||||
### Workflow 2: Gene Knockout Screen
|
||||
|
||||
```python
|
||||
from cobra.io import load_model
|
||||
from cobra.flux_analysis import single_gene_deletion
|
||||
|
||||
# Load model
|
||||
model = load_model("ecoli")
|
||||
|
||||
# Perform single gene deletions
|
||||
results = single_gene_deletion(model)
|
||||
|
||||
# Find essential genes (growth < threshold)
|
||||
essential_genes = results[results["growth"] < 0.01]
|
||||
print(f"Found {len(essential_genes)} essential genes")
|
||||
|
||||
# Find genes with minimal impact
|
||||
neutral_genes = results[results["growth"] > 0.9 * solution.objective_value]
|
||||
```
|
||||
|
||||
### Workflow 3: Media Optimization
|
||||
|
||||
```python
|
||||
from cobra.io import load_model
|
||||
from cobra.medium import minimal_medium
|
||||
|
||||
# Load model
|
||||
model = load_model("ecoli")
|
||||
|
||||
# Calculate minimal medium for 50% of max growth
|
||||
target_growth = model.slim_optimize() * 0.5
|
||||
min_medium = minimal_medium(
|
||||
model,
|
||||
target_growth,
|
||||
minimize_components=True
|
||||
)
|
||||
|
||||
print(f"Minimal medium components: {len(min_medium)}")
|
||||
print(min_medium)
|
||||
```
|
||||
|
||||
### Workflow 4: Flux Uncertainty Analysis
|
||||
|
||||
```python
|
||||
from cobra.io import load_model
|
||||
from cobra.flux_analysis import flux_variability_analysis
|
||||
from cobra.sampling import sample
|
||||
|
||||
# Load model
|
||||
model = load_model("ecoli")
|
||||
|
||||
# First check flux ranges at optimality
|
||||
fva = flux_variability_analysis(model, fraction_of_optimum=1.0)
|
||||
|
||||
# For reactions with large ranges, sample to understand distribution
|
||||
samples = sample(model, n=1000)
|
||||
|
||||
# Analyze specific reaction
|
||||
reaction_id = "PFK"
|
||||
import matplotlib.pyplot as plt
|
||||
samples[reaction_id].hist(bins=50)
|
||||
plt.xlabel(f"Flux through {reaction_id}")
|
||||
plt.ylabel("Frequency")
|
||||
plt.show()
|
||||
```
|
||||
|
||||
### Workflow 5: Context Manager for Temporary Changes
|
||||
|
||||
Use context managers to make temporary modifications:
|
||||
```python
|
||||
# Model remains unchanged outside context
|
||||
with model:
|
||||
# Temporarily change objective
|
||||
model.objective = "ATPM"
|
||||
|
||||
# Temporarily modify bounds
|
||||
model.reactions.EX_glc__D_e.lower_bound = -5.0
|
||||
|
||||
# Temporarily knock out genes
|
||||
model.genes.b0008.knock_out()
|
||||
|
||||
# Optimize with changes
|
||||
solution = model.optimize()
|
||||
print(f"Modified growth: {solution.objective_value}")
|
||||
|
||||
# All changes automatically reverted
|
||||
solution = model.optimize()
|
||||
print(f"Original growth: {solution.objective_value}")
|
||||
```
|
||||
|
||||
## Key Concepts
|
||||
|
||||
### DictList Objects
|
||||
Models use `DictList` objects for reactions, metabolites, and genes - behaving like both lists and dictionaries:
|
||||
```python
|
||||
# Access by index
|
||||
first_reaction = model.reactions[0]
|
||||
|
||||
# Access by ID
|
||||
pfk = model.reactions.get_by_id("PFK")
|
||||
|
||||
# Query methods
|
||||
atp_reactions = model.reactions.query("atp")
|
||||
```
|
||||
|
||||
### Flux Constraints
|
||||
Reaction bounds define feasible flux ranges:
|
||||
- **Irreversible**: `lower_bound = 0, upper_bound > 0`
|
||||
- **Reversible**: `lower_bound < 0, upper_bound > 0`
|
||||
- Set both bounds simultaneously with `.bounds` to avoid inconsistencies
|
||||
|
||||
### Gene-Reaction Rules (GPR)
|
||||
Boolean logic linking genes to reactions:
|
||||
```python
|
||||
# AND logic (both required)
|
||||
reaction.gene_reaction_rule = "gene1 and gene2"
|
||||
|
||||
# OR logic (either sufficient)
|
||||
reaction.gene_reaction_rule = "gene1 or gene2"
|
||||
|
||||
# Complex logic
|
||||
reaction.gene_reaction_rule = "(gene1 and gene2) or (gene3 and gene4)"
|
||||
```
|
||||
|
||||
### Exchange Reactions
|
||||
Special reactions representing metabolite import/export:
|
||||
- Named with prefix `EX_` by convention
|
||||
- Positive flux = secretion, negative flux = uptake
|
||||
- Managed through `model.medium` dictionary
|
||||
|
||||
## Best Practices
|
||||
|
||||
1. **Use context managers** for temporary modifications to avoid state management issues
|
||||
2. **Validate models** before analysis using `model.slim_optimize()` to ensure feasibility
|
||||
3. **Check solution status** after optimization - `optimal` indicates successful solve
|
||||
4. **Use loopless FVA** when thermodynamic feasibility matters
|
||||
5. **Set fraction_of_optimum** appropriately in FVA to explore suboptimal space
|
||||
6. **Parallelize** computationally expensive operations (sampling, double deletions)
|
||||
7. **Prefer SBML format** for model exchange and long-term storage
|
||||
8. **Use slim_optimize()** when only objective value needed for performance
|
||||
9. **Validate flux samples** to ensure numerical stability
|
||||
|
||||
## Troubleshooting
|
||||
|
||||
**Infeasible solutions**: Check medium constraints, reaction bounds, and model consistency
|
||||
**Slow optimization**: Try different solvers (GLPK, CPLEX, Gurobi) via `model.solver`
|
||||
**Unbounded solutions**: Verify exchange reactions have appropriate upper bounds
|
||||
**Import errors**: Ensure correct file format and valid SBML identifiers
|
||||
|
||||
## References
|
||||
|
||||
For detailed workflows and API patterns, refer to:
|
||||
- `references/workflows.md` - Comprehensive step-by-step workflow examples
|
||||
- `references/api_quick_reference.md` - Common function signatures and patterns
|
||||
|
||||
Official documentation: https://cobrapy.readthedocs.io/en/latest/
|
||||
Reference in New Issue
Block a user