Resource Configuration
Overview
Latch SDK provides flexible resource configuration for workflow tasks, enabling efficient execution on appropriate compute infrastructure including CPU, GPU, and memory-optimized instances.
Task Resource Decorators
Standard Decorators
The SDK provides pre-configured task decorators for common resource requirements:
@small_task
Default configuration for lightweight tasks:
from latch import small_task

@small_task
def lightweight_processing():
    """Minimal resource requirements"""
    pass
Use cases:
- File parsing and manipulation
- Simple data transformations
- Quick QC checks
- Metadata operations
@large_task
Increased CPU and memory for intensive computations:
from latch import large_task

@large_task
def intensive_computation():
    """Higher CPU and memory allocation"""
    pass
Use cases:
- Large file processing
- Complex statistical analyses
- Assembly tasks
- Multi-threaded operations
@small_gpu_task
GPU-enabled with minimal resources:
from latch import small_gpu_task

@small_gpu_task
def gpu_inference():
    """GPU-enabled task with basic resources"""
    pass
Use cases:
- Neural network inference
- Small-scale ML predictions
- GPU-accelerated libraries
@large_gpu_task
GPU-enabled with maximum resources:
from latch import large_gpu_task

@large_gpu_task
def gpu_training():
    """GPU with maximum CPU and memory"""
    pass
Use cases:
- Deep learning model training
- Protein structure prediction (AlphaFold)
- Large-scale GPU computations
Custom Task Configuration
For precise control, use the @custom_task decorator:
from latch import custom_task

@custom_task(
    cpu=8,
    memory=32,  # GB
    storage_gib=100,
    timeout=3600,  # seconds
)
def custom_processing():
    """Task with custom resource specifications"""
    pass
Custom Task Parameters
- cpu: Number of CPU cores (integer)
- memory: Memory in GB (integer)
- storage_gib: Ephemeral storage in GiB (integer)
- timeout: Maximum execution time in seconds (integer)
- gpu: Number of GPUs (integer, 0 for CPU-only)
- gpu_type: Specific GPU model (string, e.g., "nvidia-tesla-v100")
Advanced Custom Configuration
from latch import custom_task

@custom_task(
    cpu=16,
    memory=64,
    storage_gib=500,
    timeout=7200,
    gpu=1,
    gpu_type="nvidia-tesla-a100",
)
def alphafold_prediction():
    """AlphaFold with A100 GPU and high memory"""
    pass
GPU Configuration
GPU Types
Available GPU options:
- nvidia-tesla-k80: Basic GPU for testing
- nvidia-tesla-v100: High-performance for training
- nvidia-tesla-a100: Latest generation for maximum performance
Multi-GPU Tasks
from latch import custom_task

@custom_task(
    cpu=32,
    memory=128,
    gpu=4,
    gpu_type="nvidia-tesla-v100",
)
def multi_gpu_training():
    """Distributed training across multiple GPUs"""
    pass
Resource Selection Strategies
By Computational Requirements
Memory-Intensive Tasks:
@custom_task(cpu=4, memory=128)  # High memory, moderate CPU
def genome_assembly():
    pass
CPU-Intensive Tasks:
@custom_task(cpu=64, memory=32)  # High CPU, moderate memory
def parallel_alignment():
    pass
I/O-Intensive Tasks:
@custom_task(cpu=8, memory=16, storage_gib=1000)  # Large ephemeral storage
def data_preprocessing():
    pass
By Workflow Phase
Quick Validation:
@small_task
def validate_inputs():
    """Fast input validation"""
    pass
Main Computation:
@large_task
def primary_analysis():
    """Resource-intensive analysis"""
    pass
Result Aggregation:
@small_task
def aggregate_results():
    """Lightweight result compilation"""
    pass
Workflow Resource Planning
Complete Pipeline Example
from latch import workflow, small_task, large_task, large_gpu_task
from latch.types import LatchFile

@small_task
def quality_control(fastq: LatchFile) -> LatchFile:
    """QC doesn't need many resources"""
    return qc_output

@large_task
def alignment(fastq: LatchFile) -> LatchFile:
    """Alignment benefits from more CPU"""
    return bam_output

@large_gpu_task
def variant_calling(bam: LatchFile) -> LatchFile:
    """GPU-accelerated variant caller"""
    return vcf_output

@small_task
def generate_report(vcf: LatchFile) -> LatchFile:
    """Simple report generation"""
    return report

@workflow
def genomics_pipeline(input_fastq: LatchFile) -> LatchFile:
    """Resource-optimized genomics pipeline"""
    qc = quality_control(fastq=input_fastq)
    aligned = alignment(fastq=qc)
    variants = variant_calling(bam=aligned)
    return generate_report(vcf=variants)
Timeout Configuration
Setting Timeouts
from latch import custom_task

@custom_task(
    cpu=8,
    memory=32,
    timeout=10800,  # 3 hours in seconds
)
def long_running_analysis():
    """Analysis with extended timeout"""
    pass
Timeout Best Practices
- Estimate conservatively: Add buffer time beyond expected duration
- Monitor actual runtimes: Adjust based on real execution data
- Default timeout: Most tasks have 1-hour default
- Maximum timeout: Check platform limits for very long jobs
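The rule of thumb above can be encoded as a small helper that pads an observed runtime by a buffer factor and never drops below the 1-hour default. This is an illustrative sketch, not part of the Latch SDK; the function name and defaults are assumptions:

```python
def padded_timeout(observed_seconds: float, buffer_factor: float = 1.5,
                   minimum: int = 3600) -> int:
    """Conservative timeout: observed runtime plus a buffer, floored at 1 hour."""
    return max(minimum, int(observed_seconds * buffer_factor))

# A task observed to run ~2 hours gets a 3-hour timeout:
padded_timeout(7200)  # 10800
```

The result can be passed directly to the timeout parameter of @custom_task.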
Storage Configuration
Ephemeral Storage
Configure temporary storage for intermediate files:
@custom_task(
    cpu=8,
    memory=32,
    storage_gib=500,  # 500 GiB temporary storage
)
def process_large_dataset():
    """Task with large intermediate files"""
    # Ephemeral storage available at /tmp
    temp_file = "/tmp/intermediate_data.bam"
    pass
Storage Guidelines
- Default storage is typically sufficient for most tasks
- Specify larger storage for tasks with large intermediate files
- Ephemeral storage is cleared after task completion
- Use LatchDir for persistent storage needs
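Because ephemeral storage is fixed when the task launches, it can help to fail fast if /tmp is too small for the expected intermediates. A minimal check using only the standard library (the helper name and threshold are assumptions, not SDK API):

```python
import shutil

def ensure_free_space(path: str = "/tmp", needed_gib: float = 100.0) -> None:
    """Raise early if ephemeral storage cannot hold the intermediate files."""
    free_gib = shutil.disk_usage(path).free / 2**30
    if free_gib < needed_gib:
        raise RuntimeError(
            f"Only {free_gib:.1f} GiB free at {path}; need {needed_gib} GiB. "
            "Increase storage_gib on the task decorator."
        )
```

Calling this at the top of a task turns a late disk-full failure into an immediate, clearly labeled error.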
Cost Optimization
Resource Efficiency Tips
- Right-size resources: Don't over-allocate
- Use appropriate decorators: Start with standard decorators
- GPU only when needed: GPU tasks cost more
- Parallel small tasks: several right-sized tasks often cost less than one oversized task
- Monitor usage: Review actual resource utilization
Example: Cost-Effective Design
# INEFFICIENT: All tasks use large resources
@large_task
def validate_input():  # Over-provisioned
    pass

@large_task
def simple_transformation():  # Over-provisioned
    pass

# EFFICIENT: Right-sized resources
@small_task
def validate_input():  # Appropriate
    pass

@small_task
def simple_transformation():  # Appropriate
    pass

@large_task
def intensive_analysis():  # Appropriate
    pass
Monitoring and Debugging
Resource Usage Monitoring
During workflow execution, monitor:
- CPU utilization
- Memory usage
- GPU utilization (if applicable)
- Execution duration
- Storage consumption
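One lightweight way to collect real memory numbers is to log peak RSS at the end of a task with the standard-library resource module. This is a sketch, not a Latch SDK feature; it works on the Linux instances tasks run on (the module is Unix-only):

```python
import resource
import sys

def peak_rss_mib() -> float:
    """Peak resident set size of this process in MiB.

    ru_maxrss is reported in KiB on Linux but in bytes on macOS.
    """
    rss = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    if sys.platform == "darwin":
        rss //= 1024
    return rss / 1024

print(f"peak memory: {peak_rss_mib():.1f} MiB")
```

Comparing the logged peak against the decorator's memory setting shows whether a task is over- or under-provisioned.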
Common Resource Issues
Out of Memory (OOM):
# Solution: Increase memory allocation
@custom_task(cpu=8, memory=64)  # Increased from 32 to 64 GB
def memory_intensive_task():
    pass
Timeout:
# Solution: Increase timeout
@custom_task(cpu=8, memory=32, timeout=14400)  # 4 hours
def long_task():
    pass
Insufficient Storage:
# Solution: Increase ephemeral storage
@custom_task(cpu=8, memory=32, storage_gib=1000)
def large_intermediate_files():
    pass
Conditional Resources
Dynamically allocate resources based on input:
from latch import custom_task
from latch.types import LatchFile

def get_resource_config(file_size_gb: float):
    """Determine resources based on file size"""
    if file_size_gb < 10:
        return {"cpu": 4, "memory": 16}
    elif file_size_gb < 100:
        return {"cpu": 16, "memory": 64}
    else:
        return {"cpu": 32, "memory": 128}

# Note: Resource decorators must be static
# Use multiple task variants for different sizes
@custom_task(cpu=4, memory=16)
def process_small(file: LatchFile) -> LatchFile:
    pass

@custom_task(cpu=16, memory=64)
def process_medium(file: LatchFile) -> LatchFile:
    pass

@custom_task(cpu=32, memory=128)
def process_large(file: LatchFile) -> LatchFile:
    pass
Best Practices Summary
- Start small: Begin with standard decorators, scale up if needed
- Profile first: Run test executions to determine actual needs
- GPU sparingly: Only use GPU when algorithms support it
- Parallel design: Break into smaller tasks when possible
- Monitor and adjust: Review execution metrics and optimize
- Document requirements: Comment why specific resources are needed
- Test locally: Use Docker locally to validate before registration
- Consider cost: Balance performance with cost efficiency
Platform-Specific Considerations
Available Resources
Latch platform provides:
- CPU instances: Up to 96 cores
- Memory: Up to 768 GB
- GPUs: K80, V100, A100 variants
- Storage: Configurable ephemeral storage
Resource Limits
Check current platform limits:
- Maximum CPUs per task
- Maximum memory per task
- Maximum GPU allocation
- Maximum concurrent tasks
Quotas and Limits
Be aware of workspace quotas:
- Total concurrent executions
- Total resource allocation
- Storage limits
- Execution time limits
Contact Latch support for quota increases if needed.