Resource Configuration
Overview
Latch SDK provides flexible resource configuration for workflow tasks, enabling efficient execution on appropriate compute infrastructure including CPU, GPU, and memory-optimized instances.
Task Resource Decorators
Standard Decorators
The SDK provides pre-configured task decorators for common resource requirements:
@small_task
Default configuration for lightweight tasks:
from latch import small_task

@small_task
def lightweight_processing():
    """Minimal resource requirements"""
    pass
Use cases:
- File parsing and manipulation
- Simple data transformations
- Quick QC checks
- Metadata operations
@large_task
Increased CPU and memory for intensive computations:
from latch import large_task

@large_task
def intensive_computation():
    """Higher CPU and memory allocation"""
    pass
Use cases:
- Large file processing
- Complex statistical analyses
- Assembly tasks
- Multi-threaded operations
@small_gpu_task
GPU-enabled with minimal resources:
from latch import small_gpu_task

@small_gpu_task
def gpu_inference():
    """GPU-enabled task with basic resources"""
    pass
Use cases:
- Neural network inference
- Small-scale ML predictions
- GPU-accelerated libraries
@large_gpu_task
GPU-enabled with maximum resources:
from latch import large_gpu_task

@large_gpu_task
def gpu_training():
    """GPU with maximum CPU and memory"""
    pass
Use cases:
- Deep learning model training
- Protein structure prediction (AlphaFold)
- Large-scale GPU computations
Custom Task Configuration
For precise control, use the @custom_task decorator:
from latch import custom_task

@custom_task(
    cpu=8,
    memory=32,  # GB
    storage_gib=100,
    timeout=3600,  # seconds
)
def custom_processing():
    """Task with custom resource specifications"""
    pass
Custom Task Parameters
- cpu: Number of CPU cores (integer)
- memory: Memory in GB (integer)
- storage_gib: Ephemeral storage in GiB (integer)
- timeout: Maximum execution time in seconds (integer)
- gpu: Number of GPUs (integer, 0 for CPU-only)
- gpu_type: Specific GPU model (string, e.g., "nvidia-tesla-v100")
Advanced Custom Configuration
from latch import custom_task

@custom_task(
    cpu=16,
    memory=64,
    storage_gib=500,
    timeout=7200,
    gpu=1,
    gpu_type="nvidia-tesla-a100",
)
def alphafold_prediction():
    """AlphaFold with A100 GPU and high memory"""
    pass
GPU Configuration
GPU Types
Available GPU options:
- nvidia-tesla-k80: Basic GPU for testing
- nvidia-tesla-v100: High-performance for training
- nvidia-tesla-a100: Latest generation for maximum performance
Multi-GPU Tasks
from latch import custom_task

@custom_task(
    cpu=32,
    memory=128,
    gpu=4,
    gpu_type="nvidia-tesla-v100",
)
def multi_gpu_training():
    """Distributed training across multiple GPUs"""
    pass
Resource Selection Strategies
By Computational Requirements
Memory-Intensive Tasks:
@custom_task(cpu=4, memory=128)  # High memory, moderate CPU
def genome_assembly():
    pass
CPU-Intensive Tasks:
@custom_task(cpu=64, memory=32)  # High CPU, moderate memory
def parallel_alignment():
    pass
I/O-Intensive Tasks:
@custom_task(cpu=8, memory=16, storage_gib=1000)  # Large ephemeral storage
def data_preprocessing():
    pass
By Workflow Phase
Quick Validation:
@small_task
def validate_inputs():
    """Fast input validation"""
    pass
Main Computation:
@large_task
def primary_analysis():
    """Resource-intensive analysis"""
    pass
Result Aggregation:
@small_task
def aggregate_results():
    """Lightweight result compilation"""
    pass
Workflow Resource Planning
Complete Pipeline Example
from latch import workflow, small_task, large_task, large_gpu_task
from latch.types import LatchFile

@small_task
def quality_control(fastq: LatchFile) -> LatchFile:
    """QC doesn't need many resources"""
    return qc_output

@large_task
def alignment(fastq: LatchFile) -> LatchFile:
    """Alignment benefits from more CPU"""
    return bam_output

@large_gpu_task
def variant_calling(bam: LatchFile) -> LatchFile:
    """GPU-accelerated variant caller"""
    return vcf_output

@small_task
def generate_report(vcf: LatchFile) -> LatchFile:
    """Simple report generation"""
    return report

@workflow
def genomics_pipeline(input_fastq: LatchFile) -> LatchFile:
    """Resource-optimized genomics pipeline"""
    qc = quality_control(fastq=input_fastq)
    aligned = alignment(fastq=qc)
    variants = variant_calling(bam=aligned)
    return generate_report(vcf=variants)
Timeout Configuration
Setting Timeouts
from latch import custom_task

@custom_task(
    cpu=8,
    memory=32,
    timeout=10800,  # 3 hours in seconds
)
def long_running_analysis():
    """Analysis with extended timeout"""
    pass
Timeout Best Practices
- Estimate conservatively: Add buffer time beyond expected duration
- Monitor actual runtimes: Adjust based on real execution data
- Default timeout: Most tasks have 1-hour default
- Maximum timeout: Check platform limits for very long jobs
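The rule of thumb above can be encoded as a small helper that pads an observed runtime by a buffer factor and never drops below the 1-hour default. This is an illustrative sketch, not part of the Latch SDK; the function name and defaults are assumptions:

```python
def padded_timeout(observed_seconds: float, buffer_factor: float = 1.5,
                   minimum: int = 3600) -> int:
    """Conservative timeout: observed runtime plus a buffer, floored at 1 hour."""
    return max(minimum, int(observed_seconds * buffer_factor))

# A task observed to run ~2 hours gets a 3-hour timeout:
padded_timeout(7200)  # 10800
```

The result can be passed directly to the timeout parameter of @custom_task.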
Storage Configuration
Ephemeral Storage
Configure temporary storage for intermediate files:
@custom_task(
    cpu=8,
    memory=32,
    storage_gib=500,  # 500 GiB temporary storage
)
def process_large_dataset():
    """Task with large intermediate files"""
    # Ephemeral storage available at /tmp
    temp_file = "/tmp/intermediate_data.bam"
    pass
Storage Guidelines
- Default storage is typically sufficient for most tasks
- Specify larger storage for tasks with large intermediate files
- Ephemeral storage is cleared after task completion
- Use LatchDir for persistent storage needs
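Because ephemeral storage is fixed when the task launches, it can help to fail fast if /tmp is too small for the expected intermediates. A minimal check using only the standard library (the helper name and threshold are assumptions, not SDK API):

```python
import shutil

def ensure_free_space(path: str = "/tmp", needed_gib: float = 100.0) -> None:
    """Raise early if ephemeral storage cannot hold the intermediate files."""
    free_gib = shutil.disk_usage(path).free / 2**30
    if free_gib < needed_gib:
        raise RuntimeError(
            f"Only {free_gib:.1f} GiB free at {path}; need {needed_gib} GiB. "
            "Increase storage_gib on the task decorator."
        )
```

Calling this at the top of a task turns a late disk-full failure into an immediate, clearly labeled error.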
Cost Optimization
Resource Efficiency Tips
- Right-size resources: Don't over-allocate
- Use appropriate decorators: Start with standard decorators
- GPU only when needed: GPU tasks cost more
- Parallel small tasks: several right-sized tasks often cost less than one oversized task
- Monitor usage: Review actual resource utilization
Example: Cost-Effective Design
# INEFFICIENT: All tasks use large resources
@large_task
def validate_input():  # Over-provisioned
    pass

@large_task
def simple_transformation():  # Over-provisioned
    pass

# EFFICIENT: Right-sized resources
@small_task
def validate_input():  # Appropriate
    pass

@small_task
def simple_transformation():  # Appropriate
    pass

@large_task
def intensive_analysis():  # Appropriate
    pass
Monitoring and Debugging
Resource Usage Monitoring
During workflow execution, monitor:
- CPU utilization
- Memory usage
- GPU utilization (if applicable)
- Execution duration
- Storage consumption
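One lightweight way to collect real memory numbers is to log peak RSS at the end of a task with the standard-library resource module. This is a sketch, not a Latch SDK feature; it works on the Linux instances tasks run on (the module is Unix-only):

```python
import resource
import sys

def peak_rss_mib() -> float:
    """Peak resident set size of this process in MiB.

    ru_maxrss is reported in KiB on Linux but in bytes on macOS.
    """
    rss = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    if sys.platform == "darwin":
        rss //= 1024
    return rss / 1024

print(f"peak memory: {peak_rss_mib():.1f} MiB")
```

Comparing the logged peak against the decorator's memory setting shows whether a task is over- or under-provisioned.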
Common Resource Issues
Out of Memory (OOM):
# Solution: Increase memory allocation
@custom_task(cpu=8, memory=64)  # Increased from 32 to 64 GB
def memory_intensive_task():
    pass
Timeout:
# Solution: Increase timeout
@custom_task(cpu=8, memory=32, timeout=14400)  # 4 hours
def long_task():
    pass
Insufficient Storage:
# Solution: Increase ephemeral storage
@custom_task(cpu=8, memory=32, storage_gib=1000)
def large_intermediate_files():
    pass
Conditional Resources
Dynamically allocate resources based on input:
from latch import custom_task
from latch.types import LatchFile

def get_resource_config(file_size_gb: float):
    """Determine resources based on file size"""
    if file_size_gb < 10:
        return {"cpu": 4, "memory": 16}
    elif file_size_gb < 100:
        return {"cpu": 16, "memory": 64}
    else:
        return {"cpu": 32, "memory": 128}

# Note: Resource decorators must be static
# Use multiple task variants for different sizes
@custom_task(cpu=4, memory=16)
def process_small(file: LatchFile) -> LatchFile:
    pass

@custom_task(cpu=16, memory=64)
def process_medium(file: LatchFile) -> LatchFile:
    pass

@custom_task(cpu=32, memory=128)
def process_large(file: LatchFile) -> LatchFile:
    pass
Best Practices Summary
- Start small: Begin with standard decorators, scale up if needed
- Profile first: Run test executions to determine actual needs
- GPU sparingly: Only use GPU when algorithms support it
- Parallel design: Break into smaller tasks when possible
- Monitor and adjust: Review execution metrics and optimize
- Document requirements: Comment why specific resources are needed
- Test locally: Use Docker locally to validate before registration
- Consider cost: Balance performance with cost efficiency
Platform-Specific Considerations
Available Resources
Latch platform provides:
- CPU instances: Up to 96 cores
- Memory: Up to 768 GB
- GPUs: K80, V100, A100 variants
- Storage: Configurable ephemeral storage
Resource Limits
Check current platform limits:
- Maximum CPUs per task
- Maximum memory per task
- Maximum GPU allocation
- Maximum concurrent tasks
Quotas and Limits
Be aware of workspace quotas:
- Total concurrent executions
- Total resource allocation
- Storage limits
- Execution time limits
Contact Latch support for quota increases if needed.