Initial commit

2025-11-30 08:30:10 +08:00
commit f0bd18fb4e
824 changed files with 331919 additions and 0 deletions
--- a/skills/benchling-integration/SKILL.md
+++ b/skills/benchling-integration/SKILL.md
@@ -0,0 +1,473 @@
+---
+name: benchling-integration
+description: "Benchling R&D platform integration. Access registry (DNA, proteins), inventory, ELN entries, workflows via API, build Benchling Apps, query Data Warehouse, for lab data management automation."
+---
+
+# Benchling Integration
+
+## Overview
+
+Benchling is a cloud platform for life sciences R&D. Access registry entities (DNA, proteins), inventory, electronic lab notebooks, and workflows programmatically via the Python SDK and REST API.
+
+## When to Use This Skill
+
+This skill should be used when:
+- Working with Benchling's Python SDK or REST API
+- Managing biological sequences (DNA, RNA, proteins) and registry entities
+- Automating inventory operations (samples, containers, locations, transfers)
+- Creating or querying electronic lab notebook entries
+- Building workflow automations or Benchling Apps
+- Syncing data between Benchling and external systems
+- Querying the Benchling Data Warehouse for analytics
+- Setting up event-driven integrations with AWS EventBridge
+
+## Core Capabilities
+
+### 1. Authentication & Setup
+
+**Python SDK Installation:**
+```bash
+# Stable release
+uv pip install benchling-sdk
+# or with Poetry
+poetry add benchling-sdk
+```
+
+**Authentication Methods:**
+
+API Key Authentication (recommended for scripts):
+```python
+from benchling_sdk.benchling import Benchling
+from benchling_sdk.auth.api_key_auth import ApiKeyAuth
+
+benchling = Benchling(
+    url="https://your-tenant.benchling.com",
+    auth_method=ApiKeyAuth("your_api_key")
+)
+```
+
+OAuth Client Credentials (for apps):
+```python
+from benchling_sdk.auth.client_credentials_oauth2 import ClientCredentialsOAuth2
+
+auth_method = ClientCredentialsOAuth2(
+    client_id="your_client_id",
+    client_secret="your_client_secret"
+)
+benchling = Benchling(
+    url="https://your-tenant.benchling.com",
+    auth_method=auth_method
+)
+```
+
+**Key Points:**
+- API keys are obtained from Profile Settings in Benchling
+- Store credentials securely (use environment variables or password managers)
+- All API requests require HTTPS
+- Authentication permissions mirror user permissions in the UI
+
+For detailed authentication information including OIDC and security best practices, refer to `references/authentication.md`.
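+
+As a minimal sketch of the "environment variables" advice above, the helper below reads the tenant URL and API key from the environment and enforces HTTPS. The variable names `BENCHLING_URL` and `BENCHLING_API_KEY` and the function name are assumptions for illustration, not names defined by Benchling.
+
+```python
+import os
+
+
+def load_benchling_credentials(env=os.environ):
+    """Read the tenant URL and API key from environment variables.
+
+    BENCHLING_URL and BENCHLING_API_KEY are hypothetical variable names.
+    """
+    required = ("BENCHLING_URL", "BENCHLING_API_KEY")
+    missing = [name for name in required if not env.get(name)]
+    if missing:
+        raise RuntimeError(f"Missing environment variables: {', '.join(missing)}")
+
+    url = env["BENCHLING_URL"].rstrip("/")
+    if not url.startswith("https://"):
+        raise ValueError("Benchling tenant URL must use HTTPS")
+    return url, env["BENCHLING_API_KEY"]
+```
+
+The returned pair can then be passed to the client shown earlier, for example `Benchling(url=url, auth_method=ApiKeyAuth(api_key))`.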
+
+### 2. Registry & Entity Management
+
+Registry entities include DNA sequences, RNA sequences, AA sequences, custom entities, and mixtures. The SDK provides typed classes for creating and managing these entities.
+
+**Creating DNA Sequences:**
+```python
+from benchling_sdk.helpers.serialization_helpers import fields
+from benchling_sdk.models import DnaSequenceCreate
+
+sequence = benchling.dna_sequences.create(
+    DnaSequenceCreate(
+        name="My Plasmid",
+        bases="ATCGATCG",
+        is_circular=True,
+        folder_id="fld_abc123",
+        schema_id="ts_abc123",  # optional
+        # Each schema field is wrapped as {"value": ...}
+        fields=fields({"gene_name": {"value": "GFP"}})
+    )
+)
+```
+
+**Registry Registration:**
+
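+To register an entity directly upon creation:
+```python
+from benchling_sdk.models import DnaSequenceCreate, NamingStrategy
+
+sequence = benchling.dna_sequences.create(
+    DnaSequenceCreate(
+        name="My Plasmid",
+        bases="ATCGATCG",
+        is_circular=True,
+        folder_id="fld_abc123",
+        registry_id="src_abc123",  # Registry to register in
+        naming_strategy=NamingStrategy.NEW_IDS  # or NamingStrategy.IDS_FROM_NAMES
+    )
+)
+```
+
+**Important:** `registry_id` selects the registry and `naming_strategy` controls how the registry ID is generated. To supply your own registry ID, set `entity_registry_id` in place of `naming_strategy`. Use either `entity_registry_id` or `naming_strategy`, never both.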
+
+**Updating Entities:**
+```python
+from benchling_sdk.helpers.serialization_helpers import fields
+from benchling_sdk.models import DnaSequenceUpdate
+
+updated = benchling.dna_sequences.update(
+    dna_sequence_id="seq_abc123",
+    dna_sequence=DnaSequenceUpdate(
+        name="Updated Plasmid Name",
+        fields=fields({"gene_name": {"value": "mCherry"}})
+    )
+)
+```
+
+Unspecified fields remain unchanged, allowing partial updates.
+
+**Listing and Pagination:**
+```python
+# List all DNA sequences (returns a lazy iterator that yields one page at a time)
+sequences = benchling.dna_sequences.list()
+for page in sequences:
+    for seq in page:
+        print(f"{seq.name} ({seq.id})")
+
+# Estimated total across all pages
+total = sequences.estimated_count()
+```
+
+**Key Operations:**
+- Create: `benchling.<entity_type>.create()`
+- Read: `benchling.<entity_type>.get(id)` or `.list()`
+- Update: `benchling.<entity_type>.update(id, update_object)`
+- Archive: `benchling.<entity_type>.archive(ids, reason)` (takes a list of IDs and an archive reason)
+
+Entity types: `dna_sequences`, `rna_sequences`, `aa_sequences`, `custom_entities`, `mixtures`
+
+For comprehensive SDK reference and advanced patterns, refer to `references/sdk_reference.md`.
+
+### 3. Inventory Management
+
+Manage physical samples, containers, boxes, and locations within the Benchling inventory system.
+
+**Creating Containers:**
+```python
+from benchling_sdk.helpers.serialization_helpers import fields
+from benchling_sdk.models import ContainerCreate
+
+container = benchling.containers.create(
+    ContainerCreate(
+        name="Sample Tube 001",
+        schema_id="cont_schema_abc123",
+        parent_storage_id="box_abc123",  # optional
+        fields=fields({"concentration": {"value": "100 ng/μL"}})
+    )
+)
+```
+
+**Managing Boxes:**
+```python
+from benchling_sdk.models import BoxCreate
+
+box = benchling.boxes.create(
+    BoxCreate(
+        name="Freezer Box A1",
+        schema_id="box_schema_abc123",
+        parent_storage_id="loc_abc123"
+    )
+)
+```
+
+**Transferring Items:**
+```python
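+from benchling_sdk.models import ContainerUpdate
+
+# Move a container to a new storage location by updating its parent
+moved = benchling.containers.update(
+    container_id="cont_abc123",
+    container=ContainerUpdate(parent_storage_id="box_xyz789")
+)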
+```
+
+**Key Inventory Operations:**
+- Create containers, boxes, locations, plates
+- Update inventory item properties
+- Transfer items between locations
+- Check in/out items
+- Batch operations for bulk transfers
+
+### 4. Notebook & Documentation
+
+Interact with electronic lab notebook (ELN) entries, protocols, and templates.
+
+**Creating Notebook Entries:**
+```python
+from benchling_sdk.helpers.serialization_helpers import fields
+from benchling_sdk.models import EntryCreate
+
+entry = benchling.entries.create(
+    EntryCreate(
+        name="Experiment 2025-10-20",
+        folder_id="fld_abc123",
+        schema_id="entry_schema_abc123",
+        fields=fields({"objective": {"value": "Test gene expression"}})
+    )
+)
+```
+
+**Linking Entities to Entries:**
+```python
+# Add references to entities in an entry
+entry_link = benchling.entry_links.create(
+    entry_id="entry_abc123",
+    entity_id="seq_xyz789"
+)
+```
+
+**Key Notebook Operations:**
+- Create and update lab notebook entries
+- Manage entry templates
+- Link entities and results to entries
+- Export entries for documentation
+
+### 5. Workflows & Automation
+
+Automate laboratory processes using Benchling's workflow system.
+
+**Creating Workflow Tasks:**
+```python
+from benchling_sdk.helpers.serialization_helpers import fields
+from benchling_sdk.models import WorkflowTaskCreate
+
+task = benchling.workflow_tasks.create(
+    WorkflowTaskCreate(
+        name="PCR Amplification",
+        workflow_id="wf_abc123",
+        assignee_id="user_abc123",
+        fields=fields({"template": {"value": "seq_abc123"}})
+    )
+)
+```
+
+**Updating Task Status:**
+```python
+from benchling_sdk.models import WorkflowTaskUpdate
+
+updated_task = benchling.workflow_tasks.update(
+    task_id="task_abc123",
+    workflow_task=WorkflowTaskUpdate(
+        status_id="status_complete_abc123"
+    )
+)
+```
+
+**Asynchronous Operations:**
+
+Some operations are asynchronous and return tasks:
+```python
+# Block until the asynchronous task finishes, polling every 2 seconds
+# and giving up after 300 seconds
+result = benchling.tasks.wait_for_task(
+    task_id="task_abc123",
+    interval_wait_seconds=2,
+    max_wait_seconds=300
+)
+```
+
+**Key Workflow Operations:**
+- Create and manage workflow tasks
+- Update task statuses and assignments
+- Execute bulk operations asynchronously
+- Monitor task progress
+
+### 6. Events & Integration
+
+Subscribe to Benchling events for real-time integrations using AWS EventBridge.
+
+**Event Types:**
+- Entity creation, update, archive
+- Inventory transfers
+- Workflow task status changes
+- Entry creation and updates
+- Results registration
+
+**Integration Pattern:**
+1. Configure event routing to AWS EventBridge in Benchling settings
+2. Create EventBridge rules to filter events
+3. Route events to Lambda functions or other targets
+4. Process events and update external systems
+
+**Use Cases:**
+- Sync Benchling data to external databases
+- Trigger downstream processes on workflow completion
+- Send notifications on entity changes
+- Audit trail logging
+
+Refer to Benchling's event documentation for event schemas and configuration.
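+
+The integration pattern above could be sketched as a Lambda handler that dispatches on the EventBridge `detail-type` field. This is only an illustration: the event type string `v2.entity.registered`, the shape of `detail`, and the handler names are assumptions, so check Benchling's event schemas before relying on them.
+
+```python
+def on_entity_registered(detail):
+    """Hypothetical handler: return the ID carried in the event detail."""
+    return detail.get("id")
+
+
+# Map EventBridge detail-type values to handlers.
+# The key below is an assumed event type, used for illustration only.
+HANDLERS = {
+    "v2.entity.registered": on_entity_registered,
+}
+
+
+def lambda_handler(event, context):
+    """Route an EventBridge event to the matching handler, if any."""
+    detail_type = event.get("detail-type", "")
+    detail = event.get("detail", {})
+    handler = HANDLERS.get(detail_type)
+    if handler is None:
+        # Unknown events are acknowledged but not processed
+        return {"handled": False, "detail_type": detail_type}
+    return {
+        "handled": True,
+        "detail_type": detail_type,
+        "result": handler(detail),
+    }
+```
+
+Keeping the dispatch table separate from the handlers makes it easy to add new event types without touching the routing logic.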
+
+### 7. Data Warehouse & Analytics
+
+Query historical Benchling data using SQL through the Data Warehouse.
+
+**Access Method:**
+The Benchling Data Warehouse provides SQL access to Benchling data for analytics and reporting. Connect using standard SQL clients with provided credentials.
+
+**Common Queries:**
+- Aggregate experimental results
+- Analyze inventory trends
+- Generate compliance reports
+- Export data for external analysis
+
+**Integration with Analysis Tools:**
+- Jupyter notebooks for interactive analysis
+- BI tools (Tableau, Looker, PowerBI)
+- Custom dashboards
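+
+A minimal sketch of connecting from Python, assuming the warehouse is reachable as a PostgreSQL database and that `psycopg2` is installed. The `WAREHOUSE_*` environment variable names, the default database name, and the `run_query` helper are assumptions for illustration. Use the host and credentials issued for your tenant.
+
+```python
+import os
+
+
+def warehouse_connection_kwargs(env=os.environ):
+    """Build keyword arguments for a PostgreSQL client such as psycopg2.connect.
+
+    The WAREHOUSE_* variable names are hypothetical.
+    """
+    return {
+        "host": env["WAREHOUSE_HOST"],
+        "port": int(env.get("WAREHOUSE_PORT", "5432")),
+        "dbname": env.get("WAREHOUSE_DB", "warehouse"),  # assumed default
+        "user": env["WAREHOUSE_USER"],
+        "password": env["WAREHOUSE_PASSWORD"],
+        "sslmode": "require",  # refuse unencrypted connections
+    }
+
+
+def run_query(sql, params=()):
+    """Run a parameterized query and return all rows."""
+    import psycopg2  # third-party driver, imported lazily
+
+    with psycopg2.connect(**warehouse_connection_kwargs()) as conn:
+        with conn.cursor() as cur:
+            cur.execute(sql, params)
+            return cur.fetchall()
+```
+
+Pass values through `params` rather than formatting them into the SQL string, so the driver handles quoting.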
+
+## Best Practices
+
+### Error Handling
+
+The SDK automatically retries failed requests:
+```python
+# Automatic retry for 429, 502, 503, 504 status codes
+# Up to 5 retries with exponential backoff
+# Customize retry behavior if needed
+from benchling_sdk.auth.api_key_auth import ApiKeyAuth
+from benchling_sdk.benchling import Benchling
+from benchling_sdk.helpers.retry_helpers import RetryStrategy
+
+benchling = Benchling(
+    url="https://your-tenant.benchling.com",
+    auth_method=ApiKeyAuth("your_api_key"),
+    retry_strategy=RetryStrategy(max_tries=3)
+)
+```
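+
+To make the retry behavior concrete, here is a standalone sketch of exponential backoff over the same status codes. It illustrates the general technique and is not the SDK's implementation. `call_with_backoff`, the `send` callable, and the 0.5-second base delay are assumptions.
+
+```python
+import time
+
+RETRYABLE_STATUS_CODES = {429, 502, 503, 504}
+
+
+def call_with_backoff(send, max_tries=5, base_delay=0.5, sleep=time.sleep):
+    """Call send() until it returns a non-retryable status or tries run out.
+
+    send is a hypothetical zero-argument callable returning (status, body).
+    """
+    for attempt in range(max_tries):
+        status, body = send()
+        if status not in RETRYABLE_STATUS_CODES:
+            return status, body
+        if attempt < max_tries - 1:
+            # Double the wait after each failed attempt
+            sleep(base_delay * 2 ** attempt)
+    # All tries exhausted: hand back the last retryable response
+    return status, body
+```
+
+Injecting `sleep` as a parameter keeps the function testable without real delays.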
+
+### Pagination Efficiency
+
+Use generators for memory-efficient pagination:
+```python
+# Generator-based iteration
+for page in benchling.dna_sequences.list():
+    for sequence in page:
+        process(sequence)
+
+# Check estimated count without loading all pages
+total = benchling.dna_sequences.list().estimated_count()
+```
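+
+When only individual items matter, the nested loop can be wrapped in a small helper. The sketch below works on any iterable of pages. `iter_items` and `take` are hypothetical helper names, not part of the SDK.
+
+```python
+from itertools import chain, islice
+
+
+def iter_items(pages):
+    """Yield individual items from an iterable of pages, lazily."""
+    return chain.from_iterable(pages)
+
+
+def take(pages, limit):
+    """Return at most `limit` items without requesting pages beyond those needed."""
+    return list(islice(iter_items(pages), limit))
+```
+
+For example, `take(benchling.dna_sequences.list(), 10)` would stop iterating once ten sequences have been collected.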
+
+### Schema Fields Helper
+
+Use the `fields()` helper for custom schema fields:
+```python
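+from benchling_sdk.helpers.serialization_helpers import fields
+
+# Convert a dict to a Fields object; each entry is wrapped as {"value": ...}
+custom_fields = fields({
+    "concentration": {"value": "100 ng/μL"},
+    "date_prepared": {"value": "2025-10-20"},
+    "notes": {"value": "High quality prep"}
+})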
+```
+
+### Forward Compatibility
+
+The SDK handles unknown enum values and types gracefully:
+- Unknown enum values are preserved
+- Unrecognized polymorphic types return `UnknownType`
+- Allows working with newer API versions
+
+### Security Considerations
+
+- Never commit API keys to version control
+- Use environment variables for credentials
+- Rotate keys if compromised
+- Grant minimal necessary permissions for apps
+- Use OAuth for multi-user scenarios
+
+## Resources
+
+### references/
+
+Detailed reference documentation for in-depth information:
+
+- **authentication.md** - Comprehensive authentication guide including OIDC, security best practices, and credential management
+- **sdk_reference.md** - Detailed Python SDK reference with advanced patterns, examples, and all entity types
+- **api_endpoints.md** - REST API endpoint reference for direct HTTP calls without the SDK
+
+Load these references as needed for specific integration requirements.
+
+### scripts/
+
+This skill currently includes example scripts that can be removed or replaced with custom automation scripts for your specific Benchling workflows.
+
+## Common Use Cases
+
+**1. Bulk Entity Import:**
+```python
+# Import multiple sequences from a FASTA file (requires Biopython)
+from Bio import SeqIO
+from benchling_sdk.models import DnaSequenceCreate
+
+for record in SeqIO.parse("sequences.fasta", "fasta"):
+    benchling.dna_sequences.create(
+        DnaSequenceCreate(
+            name=record.id,
+            bases=str(record.seq),
+            is_circular=False,
+            folder_id="fld_abc123"
+        )
+    )
+```
+
+**2. Inventory Audit:**
+```python
+# List all containers in a specific location
+containers = benchling.containers.list(
+    parent_storage_id="box_abc123"
+)
+
+for page in containers:
+    for container in page:
+        print(f"{container.name}: {container.barcode}")
+```
+
+**3. Workflow Automation:**
+```python
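+# Update all pending tasks for a workflow
+# (auto_validate is a placeholder for your own validation function)
+from benchling_sdk.models import WorkflowTaskUpdate
+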
+tasks = benchling.workflow_tasks.list(
+    workflow_id="wf_abc123",
+    status="pending"
+)
+
+for page in tasks:
+    for task in page:
+        # Perform automated checks
+        if auto_validate(task):
+            benchling.workflow_tasks.update(
+                task_id=task.id,
+                workflow_task=WorkflowTaskUpdate(
+                    status_id="status_complete"
+                )
+            )
+```
+
+**4. Data Export:**
+```python
+# Export all sequences that use a specific schema
+import csv
+
+# Filter by schema on the server instead of inspecting every sequence
+sequences = benchling.dna_sequences.list(schema_id="target_schema_id")
+export_data = []
+
+for page in sequences:
+    for seq in page:
+        export_data.append({
+            "id": seq.id,
+            "name": seq.name,
+            "bases": seq.bases,
+            "length": len(seq.bases)
+        })
+
+# Save to CSV; fixed column names keep this safe when nothing matched
+with open("sequences.csv", "w", newline="") as f:
+    writer = csv.DictWriter(f, fieldnames=["id", "name", "bases", "length"])
+    writer.writeheader()
+    writer.writerows(export_data)
+```
+
+## Additional Resources
+
+- **Official Documentation:** https://docs.benchling.com
+- **Python SDK Reference:** https://benchling.com/sdk-docs/
+- **API Reference:** https://benchling.com/api/reference
+- **Support:** [email protected]