Files
gh-k-dense-ai-claude-scient…/skills/scvi-tools/references/models-spatial.md
2025-11-30 08:30:10 +08:00

439 lines
11 KiB
Markdown
Raw Blame History

This file contains ambiguous Unicode characters
This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.
# Spatial Transcriptomics Models
This document covers models for analyzing spatially-resolved transcriptomics data in scvi-tools.
## DestVI (Deconvolution of Spatial Transcriptomics using Variational Inference)
**Purpose**: Multi-resolution deconvolution of spatial transcriptomics using single-cell reference data.
**Key Features**:
- Estimates cell type proportions at each spatial location
- Uses single-cell RNA-seq reference for deconvolution
- Multi-resolution approach (global and local patterns)
- Accounts for spatial correlation
- Provides uncertainty quantification
**When to Use**:
- Deconvolving Visium or similar spatial transcriptomics
- Have scRNA-seq reference data with cell type labels
- Want to map cell types to spatial locations
- Interested in spatial organization of cell types
- Need probabilistic estimates of cell type abundance
**Data Requirements**:
- **Spatial data**: Visium or similar spot-based measurements (target data)
- **Single-cell reference**: scRNA-seq with cell type annotations
- Both datasets should share genes
**Basic Usage**:
```python
import scvi
# Step 1: Train scVI on single-cell reference
scvi.model.SCVI.setup_anndata(sc_adata, layer="counts")
sc_model = scvi.model.SCVI(sc_adata)
sc_model.train()
# Step 2: Setup spatial data
scvi.model.DESTVI.setup_anndata(
spatial_adata,
layer="counts"
)
# Step 3: Train DestVI using reference
model = scvi.model.DESTVI.from_rna_model(
spatial_adata,
sc_model,
cell_type_key="cell_type" # Cell type labels in reference
)
model.train(max_epochs=2500)
# Step 4: Get cell type proportions
proportions = model.get_proportions()
spatial_adata.obsm["proportions"] = proportions
# Step 5: Get cell type-specific expression
# Expression of genes specific to each cell type at each spot
ct_expression = model.get_scale_for_ct("T cells")
```
**Key Parameters**:
- `amortization`: Amortization strategy ("both", "latent", "proportion")
- `n_latent`: Latent dimensionality (inherited from scVI model)
**Outputs**:
- `get_proportions()`: Cell type proportions at each spot
- `get_scale_for_ct(cell_type)`: Cell type-specific expression patterns
- `get_gamma()`: Proportion-specific gene expression scaling
**Visualization**:
```python
import scanpy as sc
import matplotlib.pyplot as plt
# Visualize specific cell type proportions spatially
sc.pl.spatial(
spatial_adata,
color="T cells", # If proportions added to .obs
spot_size=150
)
# Or use obsm directly
for ct in cell_types:
plt.figure()
sc.pl.spatial(
spatial_adata,
color=spatial_adata.obsm["proportions"][ct],
title=f"{ct} proportions"
)
```
## Stereoscope
**Purpose**: Cell type deconvolution for spatial transcriptomics using probabilistic modeling.
**Key Features**:
- Reference-based deconvolution
- Probabilistic framework for cell type proportions
- Works with various spatial technologies
- Handles gene selection and normalization
**When to Use**:
- Similar to DestVI but simpler approach
- Deconvolving spatial data with reference
- Faster alternative for basic deconvolution
**Basic Usage**:
```python
scvi.model.STEREOSCOPE.setup_anndata(
sc_adata,
labels_key="cell_type",
layer="counts"
)
# Train on reference
ref_model = scvi.model.STEREOSCOPE(sc_adata)
ref_model.train()
# Setup spatial data
scvi.model.STEREOSCOPE.setup_anndata(spatial_adata, layer="counts")
# Transfer to spatial
spatial_model = scvi.model.STEREOSCOPE.from_reference_model(
spatial_adata,
ref_model
)
spatial_model.train()
# Get proportions
proportions = spatial_model.get_proportions()
```
## Tangram
**Purpose**: Spatial mapping and integration of single-cell data to spatial locations.
**Key Features**:
- Maps single cells to spatial coordinates
- Learns optimal transport between single-cell and spatial data
- Gene imputation at spatial locations
- Cell type mapping
**When to Use**:
- Mapping cells from scRNA-seq to spatial locations
- Imputing unmeasured genes in spatial data
- Understanding spatial organization at single-cell resolution
- Integrating scRNA-seq and spatial transcriptomics
**Data Requirements**:
- Single-cell RNA-seq data with annotations
- Spatial transcriptomics data
- Shared genes between modalities
**Basic Usage**:
```python
import tangram as tg
# Map cells to spatial locations
ad_map = tg.map_cells_to_space(
adata_sc=sc_adata,
adata_sp=spatial_adata,
mode="cells", # or "clusters" for cell type mapping
density_prior="rna_count_based"
)
# Get mapping matrix (cells × spots)
mapping = ad_map.X
# Project cell annotations to space
tg.project_cell_annotations(
ad_map,
spatial_adata,
annotation="cell_type"
)
# Impute genes in spatial data
genes_to_impute = ["CD3D", "CD8A", "CD4"]
tg.project_genes(ad_map, spatial_adata, genes=genes_to_impute)
```
**Visualization**:
```python
# Visualize cell type mapping
sc.pl.spatial(
spatial_adata,
color="cell_type_projected",
spot_size=100
)
```
## gimVI (Gaussian Identity Multivi for Imputation)
**Purpose**: Cross-modality imputation between spatial and single-cell data.
**Key Features**:
- Joint model of spatial and single-cell data
- Imputes missing genes in spatial data
- Enables cross-dataset queries
- Learns shared representations
**When to Use**:
- Imputing genes not measured in spatial data
- Joint analysis of spatial and single-cell datasets
- Mapping between modalities
**Basic Usage**:
```python
# Combine datasets
combined_adata = sc.concat([sc_adata, spatial_adata])
scvi.model.GIMVI.setup_anndata(
combined_adata,
layer="counts"
)
model = scvi.model.GIMVI(combined_adata)
model.train()
# Impute genes in spatial data
imputed = model.get_imputed_values(spatial_indices)
```
## scVIVA (Variation in Variational Autoencoders for Spatial)
**Purpose**: Analyzing cell-environment relationships in spatial data.
**Key Features**:
- Models cellular neighborhoods and environments
- Identifies environment-associated gene expression
- Accounts for spatial correlation structure
- Cell-cell interaction analysis
**When to Use**:
- Understanding how spatial context affects cells
- Identifying niche-specific gene programs
- Cell-cell interaction studies
- Microenvironment analysis
**Data Requirements**:
- Spatial transcriptomics with coordinates
- Cell type annotations (optional)
**Basic Usage**:
```python
scvi.model.SCVIVA.setup_anndata(
spatial_adata,
layer="counts",
spatial_key="spatial" # Coordinates in .obsm
)
model = scvi.model.SCVIVA(spatial_adata)
model.train()
# Get environment representations
env_latent = model.get_environment_representation()
# Identify environment-associated genes
env_genes = model.get_environment_specific_genes()
```
## ResolVI
**Purpose**: Addressing spatial transcriptomics noise through resolution-aware modeling.
**Key Features**:
- Accounts for spatial resolution effects
- Denoises spatial data
- Multi-scale analysis
- Improves downstream analysis quality
**When to Use**:
- Noisy spatial data
- Multiple spatial resolutions
- Need denoising before analysis
- Improving data quality
**Basic Usage**:
```python
scvi.model.RESOLVI.setup_anndata(
spatial_adata,
layer="counts",
spatial_key="spatial"
)
model = scvi.model.RESOLVI(spatial_adata)
model.train()
# Get denoised expression
denoised = model.get_denoised_expression()
```
## Model Selection for Spatial Transcriptomics
### DestVI
**Choose when**:
- Need detailed deconvolution with reference
- Have high-quality scRNA-seq reference
- Want multi-resolution analysis
- Need uncertainty quantification
**Best for**: Visium, spot-based technologies
### Stereoscope
**Choose when**:
- Need simpler, faster deconvolution
- Basic cell type proportion estimates
- Limited computational resources
**Best for**: Quick deconvolution tasks
### Tangram
**Choose when**:
- Want single-cell resolution mapping
- Need to impute many genes
- Interested in cell positioning
- Optimal transport approach preferred
**Best for**: Detailed spatial mapping
### gimVI
**Choose when**:
- Need bidirectional imputation
- Joint modeling of spatial and single-cell
- Cross-dataset queries
**Best for**: Integration and imputation
### scVIVA
**Choose when**:
- Interested in cellular environments
- Cell-cell interaction analysis
- Neighborhood effects
**Best for**: Microenvironment studies
### ResolVI
**Choose when**:
- Data quality is a concern
- Need denoising
- Multi-scale analysis
**Best for**: Noisy data preprocessing
## Complete Workflow: Spatial Deconvolution with DestVI
```python
import scvi
import scanpy as sc
import squidpy as sq
# ===== Part 1: Prepare single-cell reference =====
# Load and process scRNA-seq reference
sc_adata = sc.read_h5ad("reference_scrna.h5ad")
# QC and filtering
sc.pp.filter_genes(sc_adata, min_cells=10)
sc.pp.highly_variable_genes(sc_adata, n_top_genes=4000)
# Train scVI on reference
scvi.model.SCVI.setup_anndata(
sc_adata,
layer="counts",
batch_key="batch"
)
sc_model = scvi.model.SCVI(sc_adata)
sc_model.train(max_epochs=400)
# ===== Part 2: Load spatial data =====
spatial_adata = sc.read_visium("path/to/visium")
spatial_adata.var_names_make_unique()
# QC spatial data
sc.pp.filter_genes(spatial_adata, min_cells=10)
# ===== Part 3: Run DestVI =====
scvi.model.DESTVI.setup_anndata(
spatial_adata,
layer="counts"
)
destvi_model = scvi.model.DESTVI.from_rna_model(
spatial_adata,
sc_model,
cell_type_key="cell_type"
)
destvi_model.train(max_epochs=2500)
# ===== Part 4: Extract results =====
# Get proportions
proportions = destvi_model.get_proportions()
spatial_adata.obsm["proportions"] = proportions
# Add proportions to .obs for easy plotting
for i, ct in enumerate(sc_model.adata.obs["cell_type"].cat.categories):
spatial_adata.obs[f"prop_{ct}"] = proportions[:, i]
# ===== Part 5: Visualization =====
# Plot specific cell types
cell_types = ["T cells", "B cells", "Macrophages"]
for ct in cell_types:
sc.pl.spatial(
spatial_adata,
color=f"prop_{ct}",
title=f"{ct} proportions",
spot_size=150,
cmap="viridis"
)
# ===== Part 6: Spatial analysis =====
# Compute spatial neighbors
sq.gr.spatial_neighbors(spatial_adata)
# Spatial autocorrelation of cell types
for ct in cell_types:
sq.gr.spatial_autocorr(
spatial_adata,
attr="obs",
mode="moran",
genes=[f"prop_{ct}"]
)
# ===== Part 7: Save results =====
destvi_model.save("destvi_model")
spatial_adata.write("spatial_deconvolved.h5ad")
```
## Best Practices for Spatial Analysis
1. **Reference quality**: Use high-quality, well-annotated scRNA-seq reference
2. **Gene overlap**: Ensure sufficient shared genes between reference and spatial
3. **Spatial coordinates**: Properly register spatial coordinates in `.obsm["spatial"]`
4. **Validation**: Use known marker genes to validate deconvolution
5. **Visualization**: Always visualize results spatially to check biological plausibility
6. **Cell type granularity**: Consider appropriate cell type resolution
7. **Computational resources**: Spatial models can be memory-intensive
8. **Quality control**: Filter low-quality spots before analysis