DNAnexus App Development
Overview
Apps and applets are executable programs that run on the DNAnexus platform. They can be written in Python or Bash and are deployed with all necessary dependencies and configuration.
Applets vs Apps
- Applets: Data objects that live inside projects. Good for development and testing.
- Apps: Versioned, shareable executables that don't live inside projects. Can be published for others to use.
Both are created identically until the final build step. Applets can be converted to apps later.
Creating an App/Applet
Using dx-app-wizard
Generate a skeleton app directory structure:
dx-app-wizard
This creates:
- dxapp.json - Configuration file
- src/ - Source code directory
- resources/ - Bundled dependencies
- test/ - Test files
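To make the role of dxapp.json more concrete, here is a minimal sketch of what such a configuration might contain, built as a Python dict and serialized to JSON. The app name, summary, and the reads_file / trimmed_reads fields are illustrative assumptions borrowed from the examples in this guide, not values the wizard is guaranteed to generate.

```python
import json

# Hypothetical minimal configuration; every name and value here is illustrative.
dxapp = {
    "name": "my-app",
    "title": "My App",
    "summary": "Trims reads from a FASTQ file",
    "dxapi": "1.0.0",
    "version": "0.0.1",
    # One entry per input; "class" is the DNAnexus data class of the field.
    "inputSpec": [
        {"name": "reads_file", "class": "file", "optional": False}
    ],
    # One entry per output returned by the entry point.
    "outputSpec": [
        {"name": "trimmed_reads", "class": "file"}
    ],
    # How the platform should run the code in src/.
    "runSpec": {
        "interpreter": "python3",
        "file": "src/my-app.py",
        "distribution": "Ubuntu",
        "release": "24.04",
        "version": "0",
    },
}

# Serialize to the text that would be saved as dxapp.json.
dxapp_text = json.dumps(dxapp, indent=2)
print(dxapp_text)
```

The input and output names declared here are the ones the entry point receives as arguments and returns as dict keys.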
Building and Deploying
Build an applet:
dx build
Build an app:
dx build --app
The build process:
- Validates dxapp.json configuration
- Bundles source code and resources
- Deploys to the platform
- Returns the applet/app ID
App Directory Structure
my-app/
├── dxapp.json          # Metadata and configuration
├── src/
│   ├── my-app.py       # Main executable (Python)
│   └── my-app.sh       # Or Bash script
├── resources/          # Bundled files and dependencies
│   ├── tools/
│   └── data/
└── test/               # Test data and scripts
    └── test.json
Python App Structure
Entry Points
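Python apps use the @dxpy.entry_point() decorator to register the functions a job can execute. The main entry point receives the job inputs as arguments and returns the outputs as a dict:

    import dxpy

    @dxpy.entry_point('main')
    def main(input1, input2):
        # Process inputs (result1 and result2 stand for values computed here)
        # Return outputs
        return {
            "output1": result1,
            "output2": result2
        }

    dxpy.run()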
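The decorator-plus-run() pattern can be easier to follow with a toy model. The sketch below is a simplified illustration of the idea (register functions by name, then dispatch to one of them with the job inputs as keyword arguments). It is not dxpy's actual implementation, and the names ENTRY_POINTS, entry_point, and run are defined locally for this example.

```python
# Toy illustration only: this is NOT dxpy's real implementation.
ENTRY_POINTS = {}


def entry_point(name):
    """Return a decorator that registers a function under `name`."""
    def decorator(func):
        ENTRY_POINTS[name] = func
        return func
    return decorator


def run(function_name, job_input):
    """Look up the named entry point and call it with the inputs as keyword arguments."""
    return ENTRY_POINTS[function_name](**job_input)


@entry_point("main")
def main(input1, input2):
    # Stand-in processing: uppercase the string and double the number.
    return {"output1": input1.upper(), "output2": input2 * 2}


result = run("main", {"input1": "abc", "input2": 21})
print(result)
```

This is why the argument names of an entry point have to match the input field names: the inputs arrive as keyword arguments.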
Input/Output Handling
Inputs: DNAnexus data objects are represented as dicts containing links:
    @dxpy.entry_point('main')
    def main(reads_file):
        # Convert link to handler
        reads_dxfile = dxpy.DXFile(reads_file)
        # Download to local filesystem
        dxpy.download_dxfile(reads_dxfile.get_id(), "reads.fastq")
        # Process file...
Outputs: Return primitive types directly; convert file outputs to links:

    # Upload result file
    output_file = dxpy.upload_local_file("output.fastq")
    return {
        "trimmed_reads": dxpy.dxlink(output_file)
    }
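The "dicts containing links" used for inputs and outputs have a small, regular shape. The following sketch mimics that shape with two locally defined helpers, make_link and link_to_id. They are hypothetical stand-ins written for this example so that it runs without dxpy, and the IDs are placeholders.

```python
def make_link(object_id, project_id=None):
    """Build a link dict in the shape used for DNAnexus object links."""
    if project_id is None:
        # Simple form: the link points straight at the object ID.
        return {"$dnanexus_link": object_id}
    # Extended form: the link also names the project that holds the object.
    return {"$dnanexus_link": {"project": project_id, "id": object_id}}


def link_to_id(link):
    """Extract the object ID from either link form."""
    target = link["$dnanexus_link"]
    return target["id"] if isinstance(target, dict) else target


simple = make_link("file-xxxx")
qualified = make_link("file-xxxx", project_id="project-yyyy")
print(simple)
print(qualified)
```

In a real app, dxpy.dxlink builds these dicts and dxpy.DXFile accepts them directly, so helpers like these are only needed for inspection or debugging.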
Bash App Structure
Bash apps use a simpler shell script approach:
    #!/bin/bash
    set -e -x -o pipefail

    main() {
        # Download inputs
        dx download "$reads_file" -o reads.fastq

        # Process
        process_reads reads.fastq > output.fastq

        # Upload outputs
        trimmed_reads=$(dx upload output.fastq --brief)

        # Set job output
        dx-jobutil-add-output trimmed_reads "$trimmed_reads" --class=file
    }
Common Development Patterns
1. Bioinformatics Pipeline
Download → Process → Upload pattern:
# Download input
dxpy.download_dxfile(input_file_id, "input.fastq")
# Run analysis
subprocess.check_call(["tool", "input.fastq", "output.bam"])
# Upload result
output = dxpy.upload_local_file("output.bam")
return {"aligned_reads": dxpy.dxlink(output)}
2. Multi-file Processing
    # Process multiple inputs
    for file_link in input_files:
        file_handler = dxpy.DXFile(file_link)
        # Use the platform file name as the local file name
        local_path = file_handler.name
        dxpy.download_dxfile(file_handler.get_id(), local_path)
        # Process each file...
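One caveat with the loop above: two different platform files can share a name, in which case the second download would overwrite the first. A minimal sketch of one way to avoid that is shown below. The helper unique_local_path is hypothetical and not part of dxpy.

```python
import os


def unique_local_path(name, used):
    """Return `name`, or a numbered variant if `name` is already in `used`."""
    base, ext = os.path.splitext(name)
    candidate, counter = name, 1
    while candidate in used:
        # Insert a counter before the final extension, e.g. a_1.fastq.
        candidate = f"{base}_{counter}{ext}"
        counter += 1
    used.add(candidate)
    return candidate


used = set()
paths = [unique_local_path(n, used) for n in ["a.fastq", "b.fastq", "a.fastq"]]
print(paths)
```

Inside the download loop, the result of unique_local_path(file_handler.name, used) would replace local_path.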
3. Parallel Processing
Apps can spawn subjobs for parallel execution:
    # Create subjobs
    subjobs = []
    for item in input_list:
        subjob = dxpy.new_dxjob(
            fn_input={"input": item},
            fn_name="process_item"
        )
        subjobs.append(subjob)

    # Collect results
    results = [job.get_output_ref("result") for job in subjobs]
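The scatter step above can be wrapped in a small function that takes the job launcher as a parameter, which makes the logic testable off the platform. This is a sketch under assumptions: scatter and FakeJob are hypothetical names defined here, FakeJob is only a local stand-in for the handler returned by dxpy.new_dxjob, and the reference dict it returns is an assumed shape for illustration.

```python
def scatter(items, new_job, fn_name="process_item"):
    """Launch one subjob per item and return a reference to each "result" output."""
    subjobs = [new_job(fn_input={"input": item}, fn_name=fn_name) for item in items]
    return [job.get_output_ref("result") for job in subjobs]


class FakeJob:
    """Local stand-in for a job handler; it launches nothing."""
    _count = 0

    def __init__(self, fn_input, fn_name):
        FakeJob._count += 1
        self.job_id = f"job-fake{FakeJob._count}"
        self.fn_input = fn_input
        self.fn_name = fn_name

    def get_output_ref(self, field):
        # Assumed shape of a job-based output reference, for illustration only.
        return {"$dnanexus_link": {"job": self.job_id, "field": field}}


refs = scatter(["chr1", "chr2", "chr3"], FakeJob)
print(refs)
```

On the platform, the call would be scatter(input_list, dxpy.new_dxjob). The returned references are placeholders that are filled in once each subjob finishes, so they can be passed on as inputs or outputs without waiting in the parent job.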
Execution Environment
Apps run in isolated Linux VMs (Ubuntu 24.04) with:
- Internet access
- DNAnexus API access
- Temporary scratch space in /home/dnanexus
- Input files downloaded to job workspace
- Root access for installing dependencies
Testing Apps
Local Testing
Test app logic locally before deploying:
cd my-app
python src/my-app.py
Platform Testing
Run the applet on the platform:
dx run applet-xxxx -i input1=file-yyyy
Monitor job execution:
dx watch job-zzzz
View job logs:
dx watch job-zzzz --get-streams
Best Practices
- Error Handling: Use try-except blocks and provide informative error messages
- Logging: Print progress and debug information to stdout/stderr
- Validation: Validate inputs before processing
- Cleanup: Remove temporary files when done
- Documentation: Include clear descriptions in dxapp.json
- Testing: Test with various input types and edge cases
- Versioning: Use semantic versioning for apps
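Several of these practices (validation, informative errors, logging to stderr, cleanup) can be combined in a few lines. The sketch below uses only the standard library. The inputs min_length and adapter and the functions validate_inputs and run_step are hypothetical examples, not part of any DNAnexus API.

```python
import logging
import os
import sys
import tempfile

# Log to stderr so messages end up in the job log.
logging.basicConfig(stream=sys.stderr, level=logging.INFO)
log = logging.getLogger("my-app")


def validate_inputs(min_length, adapter):
    """Raise ValueError with a clear message if an input is unusable."""
    if not isinstance(min_length, int) or min_length < 1:
        raise ValueError(f"min_length must be a positive integer, got {min_length!r}")
    if not adapter or set(adapter.upper()) - set("ACGTN"):
        raise ValueError(f"adapter must contain only A, C, G, T or N, got {adapter!r}")


def run_step(min_length, adapter):
    """Validate first, then work inside a temporary directory that is removed on exit."""
    validate_inputs(min_length, adapter)
    with tempfile.TemporaryDirectory() as workdir:
        scratch = os.path.join(workdir, "adapter.txt")
        with open(scratch, "w") as handle:
            handle.write(adapter)
        log.info("Wrote adapter to %s (min_length=%d)", scratch, min_length)
        summary = {"adapter_length": len(adapter), "workdir": workdir}
    # The temporary directory and its contents no longer exist at this point.
    return summary


summary = run_step(20, "ACGT")
```

Validating before any download or computation means a bad input fails the job quickly instead of after the expensive steps have run.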
Common Issues
File Not Found
Ensure files are properly downloaded before accessing:
dxpy.download_dxfile(file_id, local_path)
# Now safe to open local_path
Out of Memory
Specify a larger instance type under runSpec.systemRequirements in dxapp.json.
Timeout
Increase the timeout (runSpec.timeoutPolicy) in dxapp.json, or split the work into smaller jobs.
Permission Errors
Ensure the app requests the permissions it needs in the access section of dxapp.json.
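The last three fixes are all dxapp.json settings. As a rough sketch, the relevant fragment might look like the dict below, serialized to JSON. The instance type, the 12-hour limit, and the access levels are illustrative assumptions and should be chosen to fit the actual workload and region.

```python
import json

# Illustrative values only; "*" applies a setting to every entry point.
overrides = {
    "runSpec": {
        # Out of Memory: request a larger instance type.
        "systemRequirements": {"*": {"instanceType": "mem2_ssd1_v2_x8"}},
        # Timeout: allow each job to run longer before it is terminated.
        "timeoutPolicy": {"*": {"hours": 12}},
    },
    # Permission Errors: request network access and project-level permissions.
    "access": {"network": ["*"], "project": "CONTRIBUTE"},
}

overrides_text = json.dumps(overrides, indent=2)
print(overrides_text)
```

These keys would be merged into the existing dxapp.json rather than replacing it, and the applet or app has to be rebuilt for the change to take effect.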